GenCore version 5.1.6 
Copyright (c) 1993 - 2005 Compugen Ltd. 



OM protein - protein search, using sw model

Run on:          November 18, 2005, 16:39:26 ; Search time 237 Seconds
                 (without alignments)
                 3450.143 Million cell updates/sec

Title:           US-09-914-698-1
Perfect score:   9514
Sequence:        1 MELVWSPVLEVACKETLQLI...FISSVYAFDTILCKLQIDMF 1861

Scoring table:   BLOSUM62
                 Gapop 10.0 , Gapext 0.5

Searched:        2443163 seqs, 439378781 residues

Total number of hits satisfying chosen parameters:    2443163

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries



Database :       A_Geneseq_21:*
                 geneseqp1980s:*
                 geneseqp1990s:*
                 geneseqp2000s:*
                 geneseqp2001s:*
                 geneseqp2002s:*
                 geneseqp2003as:*
                 geneseqp2003bs:*
                 geneseqp2004s:*
                 geneseqp2005s:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
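
The header names an "sw model" search with a gap-open value of 10.0 and a gap-extension value of 0.5. As a rough illustration of what a Smith-Waterman local alignment with affine gaps computes, the sketch below uses the Gotoh recurrences. It is not GenCore's implementation: the match/mismatch scores stand in for BLOSUM62 and are hypothetical, and the convention that a gap of length k costs gap_open + k * gap_extend is an assumption, since the report does not state how Gapop and Gapext are combined.

```python
def smith_waterman_affine(a, b, match=5.0, mismatch=-4.0,
                          gap_open=10.0, gap_extend=0.5):
    """Return the best local alignment score of a and b with affine gaps.

    Assumptions (not taken from the report): a simple match/mismatch
    score instead of BLOSUM62, and a gap of length k costing
    gap_open + k * gap_extend.
    """
    neg_inf = float("-inf")
    cols = len(b) + 1
    h_prev = [0.0] * cols       # best score of an alignment ending at (i-1, j)
    f_prev = [neg_inf] * cols   # same, but ending in a vertical gap
    best = 0.0
    for i in range(1, len(a) + 1):
        h_cur = [0.0] * cols
        f_cur = [neg_inf] * cols
        e = neg_inf             # best score ending in a horizontal gap
        for j in range(1, cols):
            # Either extend an existing gap or open a new one.
            e = max(e - gap_extend, h_cur[j - 1] - gap_open - gap_extend)
            f_cur[j] = max(f_prev[j] - gap_extend,
                           h_prev[j] - gap_open - gap_extend)
            sub = match if a[i - 1] == b[j - 1] else mismatch
            # Local alignment: scores never drop below zero.
            h_cur[j] = max(0.0, h_prev[j - 1] + sub, e, f_cur[j])
            best = max(best, h_cur[j])
        h_prev, f_prev = h_cur, f_cur
    return best


# Identical sequences score len * match; a two-residue insertion costs
# gap_open + 2 * gap_extend under the assumed convention.
print(smith_waterman_affine("MELVWSPVLE", "MELVWSPVLE"))
print(smith_waterman_affine("MELVWSPVLE", "MELVWAASPVLE"))
```

Under a real substitution matrix, the score of the query aligned against itself plays the role of the "Perfect score" in the header; that reading matches Result 1, where a full-length identical hit scores 9514.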
SUMMARIES

Result  Score  Query Match  Length  DB  ID        Description
1       9514   100.0%       1861    3   AAY90350  Drosophil
2       9482   99.7%        1954    4   ABB62757  Drosophil
3       1360   14.3%        3477    9   ADX83191  Human TEG

[Summary rows 4 to 45 are not recoverable from the source; legible fragments include the identifiers ADY20048, ADQ65753, ADT71531, AAW23996 and ADN95402.]



ALIGNMENTS 



RESULT 1 
AAY90350 

ID AAY90350 standard; protein; 1861 AA. 
XX 

AC AAY90350; 
XX 

DT 04-DEC-2000 (first entry) 
XX 

DE Drosophila Asp protein sequence. 
XX 

KW Asp; Drosophila; microtubule organising centre; MTOC; mitosis inhibitor; 

KW tumour cell. 

XX 

OS Drosophila sp. 
XX 



PN WO200052478-A1. 
XX 

PD 08-SEP-2000. 
XX 

PF 03-MAR-2000; 2000WO-GB000785.
XX 

PR 04-MAR-1999; 99GB-00005007.
XX 

PA (UYDU-) UNIV DUNDEE. 
XX 

PI Glover DM, Avides MDC; 
XX 

DR WPI; 2000-594203/56. 

DR N-PSDB; AAA37761. 
XX 

PT Use of Drosophila Asp polypeptide for identifying substances capable of 

PT disrupting microtubule organizing center integrity and use of the 

PT identified substances for inhibiting mitosis in tumor cell. 
XX 

PS Claim 4; Page 43-44; 51pp; English. 
XX 

CC This sequence represents the Drosophila Asp protein. The invention 

CC relates to the use of Drosophila Asp polypeptide (or its homologue, or 

CC fragment) capable of stimulating formation and/or maintenance of 

CC microtubule organising centres (MTOCs), in an assay for identifying a 

CC substance capable of disrupting MTOC integrity. Asp polypeptide or its 

CC homolog is useful for identifying a substance capable of disrupting MTOC 

CC integrity. Substances identified by the method can be used to inhibit 

CC mitosis, e.g. in tumour cells 

XX 

SQ Sequence 1861 AA; 

Query Match 100.0%; Score 9514; DB 3; Length 1861; 

Best Local Similarity 100.0%; Pred. No. 0; 

Matches 1861; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy 1 MELVWSPVLEVACKETLQLIDNRNFRKEVMIILKSKSNQPVKNPRKFPTVGKTLQLKSPT 60 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1 MELVWSPVLEVACKETLQLIDNRNFRKEVMIILKSKSNQPVKNPRKFPTVGKTLQLKSPT 60 

Qy 61 GAGKTMKSWSAAVQQKKRMSAAAAPPSKQTWRVTAPSRPAAWAHPPPQAPLVEKNVYKT 120 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 61 GAGKTMKSWSAAVQQKKRMSAAAAPPSKQTWRVTAPSRPAAWAHPPPQAPLVEKNVYKT 120 

Qy 121 PQEEPVYISPQPRSLKENLSPMTPGNLLDVIDNLRFTPLTETRGKGQATIFPDNLAAWPT 180 

I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I 
Db 121 PQEEPVYISPQPRSLKENLSPMTPGNLLDVIDNLRFTPLTETRGKGQATIFPDNLAAWPT 180 

Qy 181 PTLKGNVKSCANDMRPRRITPDDLEDQPATNKTFDVKHSETINISLDTLDCSRIDGQPHT 240 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 181 PTLKGNVKSCANDMRPRRITPDDLEDQPATNKTFDVKHSETINISLDTLDCSRIDGQPHT 240 

Qy 241 PLNKTTTIVHATHTRALACIHEEEGPSPPRTPTKSAIHDLKRDIKLVGSPLRKYSESMKD 300 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I 
Db 241 PLNKTTTIVHATHTRALACIHEEEGPSPPRTPTKSAIHDLKRDIKLVGSPLRKYSESMKD 300 



Qy 301 LSLLSPQTKYAIQGSMPNLNEMKIRSIEQNRYYQEQQIQIKAKDLNSSSSSEASLAGQQE 360 



Db 301 LSLLSPQTKYAIQGSMPNLNEMKIRSIEQNRYYQEQQIQIKAKDLNSSSSSEASLAGQQE 360 

Qy 361 FLFNHSEILAQSSRFNLHEVGRKSVKGSPVKNPHKRRSHELSFSDAPSNESLYRNETVAI 420 

I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 361 FLFNHSEILAQSSRFNLHEVGRKSVKGSPVKNPHKRRSHELSFSDAPSNESLYRNETVAI 420 

Qy 421 SPPKKQRVEDTTLPRSAAPANASARSSSAHAWPHAQSKKFKLAQTMSLMKKPATPRKVRD 480 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 421 SPPKKQRVEDTTLPRSAAPANASARSSSAHAWPHAQSKKFKLAQTMSLMKKPATPRKVRD 480 

Qy 481 TSIQPSVKLYDSELYMQTCINPDPFAATTTIDPFLASTMYLDEQAVDRHQADFKKWLNAL 540 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 481 TSIQPSVKLYDSELYMQTCINPDPFAATTTIDPFLASTMYLDEQAVDRHQADFKKWLNAL 540 

Qy 541 VSIPADLDADLNNKIDVGKLFNEVRNKELWAPTKEEQSMNYLTKYRLETLRKAAVELFF 600 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 541 VSIPADLDADLNNKIDVGKLFNEVRNKELWAPTKEEQSMNYLTKYRLETLRKAAVELFF 600 

Qy 601 SEQMRLPCSKVAVYVNKQALRIRSDRNLHLDVVMQRTILELLLCFNPLWLRLGLEWFGE 660 

I I I I I I I I M I I I I M I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 601 SEQMRLPCSKVAVYVNKQALRIRSDRNLHLDVVMQRTILELLLCFNPLWLRLGLEWFGE 660 

Qy 661 KIQMQSNRDIVGLSTFILNRLFRNKCEEQRYSKAYTLTEEYAETIKKHSLQKILFLLPFL 720 

I I I I M I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I 
Db 661 KIQMQSNRDIVGLSTFILNRLFRNKCEEQRYSKAYTLTEEYAETIKKHSLQKILFLLPFL 720 

Qy 721 DQAKQKRIVKHNPCLFVKKSPHKETKDILLRFSSELLANIGDITRELRRLGYVLQHRQTF 780 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 721 DQAKQKRIVKHNPCLFVKKSPHKETKDILLRFSSELLANIGDITRELRRLGYVLQHRQTF 780 

Qy 781 LDEFDYAFNNLAVDLRDGVRLTRWEVILLRDDLTRQLRVPAISRLQRIFNVKLALGALG 840 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 781 LDEFDYAFNNLAVDLRDGVRLTRWEVILLRDDLTRQLRVPAISRLQRIFNVKLALGALG 840 

Qy 841 EANFQLGGDIAAQDIVDGHREKTLSLLWQLIYKFRSPKFHAAATVLQKWWRRHWLHWIQ 900 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I M I I I I I I I I I I I I I I I I I I I I I 
Db 841 EANFQLGGDIAAQDIVDGHREKTLSLLWQLIYKFRSPKFHAAATVLQKWWRRHWLHWIQ 900 

Qy 901 RRIRHKELMRRHRAATVIQAVFRGHQMRKYVKLFKTERTQAAIILQKFTRRYLAQKQLYQ 960 

1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 n I I I I I I I I I M I M 

Db 901 RRIRHKELMRRHRAATVIQAVFRGHQMRKYVKLFKTERTQAAIILQKFTRRYLAQKQLYQ 960 

Qy 961 SYHSIITIQRWWRAQQLGRQHRQRFVELREAAIFLQRIWRRRLFAKKLLAAAETARLQRS 1020 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I 
Db 961 SYHSIITIQRWWRAQQLGRQHRQRFVELREAAIFLQRIWRRRLFAKKLLAAAETARLQRS 1020 

Qy 1021 QKQQAAASYIQMQWRTYQLGRIQRHEFLRQRDLIMFVQRRMRSKWSMLEQRKEFQQLKRA 1080 

I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I M I I I I I I I 
Db 1021 QKQQAAASYIQMQWRTYQLGRIQRHEFLRQRDLIMFVQRRMRSKWSMLEQRKEFQQLKRA 1080 

Qy 1081 AINIQQRWRAKLSMRKCNADYLALRSSVLKVQAYRKATIQMRIDRNHYYSLRKNVICLQQ 1140 

I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I 
Db 1081 AINIQQRWRAKLSMRKCNADYLALRSSVLKVQAYRKATIQMRIDRNHYYSLRKNVICLQQ 1140 

Qy 1141 RLRAIMKMREQRENYLRLRNASILVQKRYRMRQQMIQDRNAYLRTRKCIINVQRRWRATL 1200 

I I I I M I I I I I I I I I I I I I I I I I I I I M I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I 



Db 1141 RLRAIMKMREQRENYLRLRNASILVQKRYRMRQQMIQDRNAYLRTRKCIINVQRRWRATL 1200 

Qy 1201 QMRRERKNYLHLQTTTKRIQIKFRAKREMKKQRAEFLQLKKVTLVVQKRRRALLQMRKER 1260 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1201 QMRRERKNYLHLQTTTKRIQIKFRAKREMKKQRAEFLQLKKVTLVVQKRRRALLQMRKER 1260 

Qy 1261 QEYLHLREVTIKLQRRFHAQKSMRFMRAKYRGTQAAVSCLQMHWRNHLLRKRERNSFLQL 1320 

I M I M I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1261 QEYLHLREVTIKLQRRFHAQKSMRFMRAKYRGTQAAVSCLQMHWRNHLLRKRERNSFLQL 1320 

Qy 1321 RQAAITLQRRYRARLNMIKQLKSYAQLKQAAITIQTRYRAKKAMQKQWLYQKQREAIIK 1380 

I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I 
Db 1321 RQAAITLQRRYRARLNMIKQLKSYAQLKQAAITIQTRYRAKKAMQKQWLYQKQREAIIK 1380 

Qy 1381 VQRRYRGNLEMRKQIEVYQKQRQAVIRLQKWWRSIRDMRLCKAGYRRIRLSSLSIQRKWR 1440 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1381 VQRRYRGNLEMRKQIEVYQKQRQAVIRLQKWWRSIRDMRLCKAGYRRIRLSSLSIQRKWR 1440 

Qy 1441 ATVQARRQREIFLSTIRKVRLMQAFIRATLLMRQQRREFEMKRRAAWIQRRFRARCAML 1500 

I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1441 ATVQARRQREIFLSTIRKVRLMQAFIRATLLMRQQRREFEMKRRAAWIQRRFRARCAML 1500 

Qy 1501 KARQDYQLIQSSVILVQRKFRANRSMKQARQEFVQLRTIAVHLQQKFRGKRLMIEQRNCF 1560 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1501 KARQDYQLIQSSVILVQRKFRANRSMKQARQEFVQLRTIAVHLQQKFRGKRLMIEQRNCF 1560 

Qy 1561 QLLRCSMPGFQARARGFMARKRFQALMTPEMMDLIRQKRAAKVIQRYWRGYLIRRRQKHQ 1620 

I M I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1561 QLLRCSMPGFQARARGFMARKRFQALMTPEMMDLIRQKRAAKVIQRYWRGYLIRRRQKHQ 1620 

Qy 1621 GLLDIRKRIAQLRQEAKAVNSVRCKVQEAVRFLRGRFIASDALAVLSQLDRLSRTVPHLL 1680 

I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1621 GLLDIRKRIAQLRQEAKAVNSVRCKVQEAVRFLRGRFIASDALAVLSQLDRLSRTVPHLL 1680 

Qy 1681 MWCSEFMSTFCYGIMAQAIRSEVDKQLIERCSRIILNLARYNSTTVNTFQEGGLVTIAQM 1740 

I I M I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1681 MWCSEFMSTFCYGIMAQAIRSEVDKQLIERCSRIILNLARYNSTTVNTFQEGGLVTIAQM 1740 

Qy 1741 LLRWCDKDSEIFNTLCTLIWVFAHCPKKRKIIHDYMTNPEAIYMVRETKKLVARKEKMKQ 1800 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I M I I I I I I I I I I I I I I I I I I I 
Db 1741 LLRWCDKDSEIFNTLCTLIWVFAHCPKKRKIIHDYMTNPEAIYMVRETKKLVARKEKMKQ 1800 

Qy 1801 NARKPPPMTSGRYKSQKINFTPCSLPSLEPDFGIIRYSPYTFISSVYAFDTILCKLQIDM 1860 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I M I 
Db 1801 NARKPPPMTSGRYKSQKINFTPCSLPSLEPDFGIIRYSPYTFISSVYAFDTILCKLQIDM 1860 

Qy 1861 F 1861 

I 

Db 1861 F 1861 
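
The "Query Match" percentages reported for each result appear consistent with the hit's score divided by the "Perfect score" from the header (9514). The sketch below checks that reading against the three results whose alignment headers are legible in this report. The formula is inferred from the reported numbers, not taken from GenCore documentation, and the function name is hypothetical.

```python
PERFECT_SCORE = 9514  # "Perfect score" from the report header


def query_match_percent(score, perfect_score=PERFECT_SCORE):
    """Score as a percentage of the perfect score, to one decimal place.

    Assumed definition, inferred from the figures in this report.
    """
    return round(100.0 * score / perfect_score, 1)


# (ID, Score) pairs copied from the alignment headers of Results 1 to 3.
reported = [("AAY90350", 9514), ("ABB62757", 9482), ("ADX83191", 1360)]
for result_id, score in reported:
    print(result_id, query_match_percent(score))
```

The computed values agree with the reported 100.0%, 99.7% and 14.3%.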

RESULT 2 
ABB62757 

ID ABB62757 standard; protein; 1954 AA. 
XX 

AC ABB62757; 
XX 



DT 26-MAR-2002 (first entry) 
XX 

DE Drosophila melanogaster polypeptide SEQ ID NO 15063. 
XX 

KW Drosophila; developmental biology; cell signalling; insecticide; 

KW pharmaceutical. 

XX 

OS Drosophila melanogaster. 
XX 

PN WO200171042-A2. 
XX 

PD 27-SEP-2001. 
XX 

PF 23-MAR-2001; 2001WO-US009231.
XX 

PR 23-MAR-2000; 2000US-0191637P.

PR 11-JUL-2000; 2000US-00614150.
XX 

PA (PEKE ) PE CORP NY. 
XX 

PI Venter JC, Adams M, Li PWD, Myers EW; 
XX 

DR WPI; 2001-656860/75. 

DR N-PSDB; ABL06860. 
XX 

PT New isolated nucleic acid detection reagent for detecting 1000 or more 

PT genes from Drosophila and for elucidating cell signaling and cell-cell 

PT interactions. 
XX 

PS Disclosure; SEQ ID NO 15063; 21pp + Sequence Listing; English. 
XX 

CC The invention relates to an isolated nucleic acid detection reagent 

CC capable of detecting 1000 or more genes from Drosophila. The invention is 

CC useful in developmental biology and in elucidating cell signalling and 

CC cell-cell interactions in higher eukaryotes for the development of 

CC insecticides, therapeutics and pharmaceutical drugs. The invention 

CC discloses genomic DNA sequences (ABL16176-ABL30511), expressed DNA 

CC sequences (ABL01840-ABL16175) and the encoded proteins (ABB57737- 

CC ABB72072). The sequence data for this patent did not form part of the 

CC printed specification, but was obtained in electronic format directly 

CC from WIPO at ftp.wipo.int/pub/published_pct_sequences 

XX 

SQ Sequence 1954 AA; 

Query Match 99.7%; Score 9482; DB 4; Length 1954; 
Best Local Similarity 99.7%; Pred. No. 0; 

Matches 1855; Conservative 4; Mismatches 2; Indels 0; Gaps 0; 

Qy 1 MELVWSPVLEVACKETLQLIDNRNFRKEVMIILKSKSNQPVKNPRKFPTVGKTLQLKSPT 60 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I 

Db 94 MELVWSPVLEVACKETLQLIDNRNFRKEVMIILKSKSNQPVKNPRKFPTVGKTLQLKSPT 153 

Qy 61 GAGKTMKSWSAAVQQKKRMSAAAAPPSKQTWRVTAPSRPAAWAHPPPQAPLVEKNVYKT 120 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I M I I I I I I I I I I I I I I I I I I I I 

Db 154 GAGKTMKSWSTVAVQQKKRMSAAAAPPSKQTWRVTAPSRPAAWAHPPPQAPLVEKNVYKT 213 



Qy 121 PQEEPVYISPQPRSLKENLSPMTPGNLLDVIDNLRFTPLTETRGKGQATIFPDNLAAWPT 180 



I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 214 PQEEPVYISPQPRSLKENLSPMTPGNLLDVIDNLRFTPLTETRGKGQATIFPDNLAAWPT 273 

Qy 181 PTLKGNVKSCANDMRPRRITPDDLEDQPATNKTFDVKHSETINISLDTLDCSRIDGQPHT 240 

I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 274 PTLKGNVKSCANDMRPRRITPDDLEDQPATNKTFDVKHSETINISLDTLDCSRIDGQPHT 333 

Qy 241 PLNKTTTIVHATHTRALACIHEEEGPSPPRTPTKSAIHDLKRDIKLVGSPLRKYSESMKD 300 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I M I I 
Db 334 PLNKTTTIVHATHTRALACIHEEEGPSPPRTPTKSAIHDLKRDIKLVGSPLRKYSESMKD 393 

Qy 301 LSLLSPQTKYAIQGSMPNLNEMKIRSIEQNRYYQEQQIQIKAKDLNSSSSSEASLAGQQE 360 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I M I I I I I I I I I I I I I I I I I I I I I I 
Db 394 LSLLSPQTKYAIQGSMPNLNEMKIRSIEQNRYYQEQQIQIKAKDLNSSSSSEASLAGQQE 453 

Qy 361 FLFNHSEILAQSSRFNLHEVGRKSVKGSPVKNPHKRRSHELSFSDAPSNESLYRNETVAI 420 

M I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I 
Db 454 FLFNHSEILAQSSRFNLHEVGRKSVKGSPVKNPHKRRSHELSFSDAPSNESLYRNETVAI 513 

Qy 421 SPPKKQRVEDTTLPRSAAPANASARSSSAHAWPHAQSKKFKLAQTMSLMKKPATPRKVRD 480 

I I I I I I I I I M M I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I 
Db 514 SPPKKQRVEDTTLPRSAAPANASARSSSAHAWPHAQSKKFKLAQTMSLMKKPATPRKVRD 573 

Qy 481 TSIQPSVKLYDSELYMQTCINPDPFAATTTIDPFLASTMYLDEQAVDRHQADFKKWLNAL 540 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 574 TSIQPSVKLYDSELYMQTCINPDPFAATTTIDPFLASTMYLDEQAVDRHQADFKKWLNAL 633 

Qy 541 VSIPADLDADLNNKIDVGKLFNEVRNKELWAPTKEEQSMNYLTKYRLETLRKAAVELFF 600 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 634 VSIPADLDADLNNKIDVGKLFNEVRNKELWAPTKEEQSMNYLTKYRLETLRKAAVELFF 693 

Qy 601 SEQMRLPCSKVAVYVNKQALRIRSDRNLHLDWMQRTILELLLCFNPLWLRLGLEWFGE 660 

I I M I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I M I I I I I I I I I I I M I I M 
Db 694 SEQMRLPCSKVAVYVNKQALRIRSDRNLHLDWMQRTILELLLCFNPLWLRLGLEWFGE 753 

Qy 661 KIQMQSNRDIVGLSTFILNRLFRNKCEEQRYSKAYTLTEEYAETIKKHSLQKILFLLPFL 720 

I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II 
Db 754 KIQMQSNRDIVGLSTFILNRLFRNKCEEQRYSKAYTLTEEYAETIKKHSLQKILFLLLFL 813 

Qy 721 DQAKQKRIVKHNPCLFVKKSPHKETKDILLRFSSELLANIGDITRELRRLGYVLQHRQTF 780 

I M I I II M I I I II II I I I I I I I I I I I II II I I I I I I I I I I I I I II I I I I I II I I I I I I I 
Db 814 DQAKQKRIVKHNPCLFVKKSPHKETKDILLRFSSELLANIGDITRELRRLGYVLQHRQTF 873 

Qy 781 LDEFDYAFNNLAVDLRDGVRLTRWEVILLRDDLTRQLRVPAISRLQRIFNVKLALGALG 840 

I I I I I I I I I I I I II I II I I I I I I I : I I I I I I I I I I I I I I I I I I M II I I I I I I I I I I I I I 

Db 874 LDEFDYAFNNLAVDLRDGVRLTRVMEVILLRDDLTRQLRVPAISRLQRIFNVKLALGALG 933 

Qy 841 EANFQLGGDIAAQDIVDGHREKTLSLLWQLIYKFRSPKFHAAATVLQKWWRRHWLHWIQ 900 

II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 934 EANFQLGGDIAAQDIVDGHREKTLSLLWQLIYKFRSPKFHAAATVLQKWWRRHWLHWIQ 993 

Qy 901 RRIRHKELMRRHRAATVIQAVFRGHQMRKYVKLFKTERTQAAIILQKFTRRYLAQKQLYQ 960 

I I I I I I I I I I I I I I I I II I I I I I I I I I I I II I I I I I I I I I I I I I I I I M I I I I I I I I I I I 
Db 994 RRIRHKELMRRHRAATVIQAVFRGHQMRKYVKLFKTERTQAAIILQKFTRRYLAQKQLYQ 1053 

Qy 961 SYHSIITIQRWWRAQQLGRQHRQRFVELREAAIFLQRIWRRRLFAKKLLAAAETARLQRS 1020 

M I I I I I I I I I I I I I I I I I I I I I I II I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I 



Db 1054 SYHSIITIQRWWRAQQLGRQHRQRFVELREAAIFLQRIWRRRLFAKKLLAAAETARLQRS 1113 

Qy 1021 QKQQAAASYIQMQWRTYQLGRIQRHEFLRQRDLIMFVQRRMRSKWSMLEQRKEFQQLKRA 1080 

I I I I I I I I I I I I I I I : I I I I I I I I : I I I I M I I I I M I I I I I I I M I I I I I I I I I I I I I 
Db 1114 QKQQAAASYIQMQWRSYQLGRIQRQQFLRQRDLIMFVQRRMRSKWSMLEQRKEFQQLKRA 1173 

Qy 1081 AINIQQRWRAKLSMRKCNADYLALRSSVLKVQAYRKATIQMRIDRNHYYSLRKNVICLQQ 1140 

I I I I I I I I M I I I I I M I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1174 AINIQQRWRAKLSMRKCNADYLALRSSVLKVQAYRKATIQMRIDRNHYYSLRKNVICLQQ 1233 

Qy 1141 RLRAIMKMREQRENYLRLRNASILVQKRYRMRQQMIQDRNAYLRTRKCIINVQRRWRATL 1200 

M I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I M I I I I I I I M I I I I I I I I I I I I 
Db 1234 RLRAIMKMREQRENYLRLRNASILVQKRYRMRQQMIQDRNAYLRTRKCIINVQRRWRATL 1293 

Qy 1201 QMRRERKNYLHLQTTTKRIQIKFRAKREMKKQRAEFLQLKKVTLWQKRRRALLQMRKER 1260 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I 
Db 1294 QMRRERKNYLHLQTTTKRIQIKFRAKREMKKQRAEFLQLKKVTLWQKRRRALLQMRKER 1353 

Qy 1261 QEYLHLREVTIKLQRRFHAQKSMRFMRAKYRGTQAAVSCLQMHWRNHLLRKRERNSFLQL 1320 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I M M M I I I I I I I I I I I I I I 
Db 1354 QEYLHLREVTIKLQRRFHAQKSMRFMRAKYRGTQAAVSCLQMHWRNHLLRKRERNSFLQL 1413 

Qy 1321 RQAAITLQRRYRARLNMIKQLKSYAQLKQAAITIQTRYRAKKAMQKQWLYQKQREAIIK 1380 

I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1414 RQAAITLQRRYRARLNMIKQLKSYAQLKQAAITIQTRYRAKKAMQKQWLYQKQREAIIK 1473 

Qy 1381 VQRRYRGNLEMRKQIEVYQKQRQAVIRLQKWWRSIRDMRLCKAGYRRIRLSSLSIQRKWR 1440 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I M I I I I I M I I I I I I 

Db 1474 VQRRYRGNLEMRKQIEVYQKQRQAVIRLQKWWRSIRDMRLCKAGYRRIRLSSLSIQRKWR 1533 

Qy 1441 ATVQARRQREIFLSTIRKVRLMQAFIRATLLMRQQRREFEMKRRAAWIQRRFRARCAML 1500 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1534 ATVQARRQREIFLSTIRKVRLMQAFIRATLLMRQQRREFEMKRRAAWIQRRFRARCAML 1593 

Qy 1501 KARQDYQLIQSSVILVQRKFRANRSMKQARQEFVQLRTIAVHLQQKFRGKRLMIEQRNCF 1560 

I I I I I M I I I I I I M I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I 
Db 1594 KARQDYQLIQSSVILVQRKFRANRSMKQARQEFVQLRTIAVHLQQKFRGKRLMIEQRNCF 1653 

Qy 1561 QLLRCSMPGFQARARGFMARKRFQALMTPEMMDLIRQKRAAKVIQRYWRGYLIRRRQKHQ 1620 

I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1654 QLLRCSMPGFQARARGFMARKRFQALMTPEMMDLIRQKRAAKVIQRYWRGYLIRRRQKHQ 1713 

Qy 1621 GLLDIRKRIAQLRQEAKAVNSVRCKVQEAVRFLRGRFIASDALAVLSQLDRLSRTVPHLL 1680 

I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I M I I M 
Db 1714 GLLDIRKRIAQLRQEAKAVNSVRCKVQEAVRFLRGRFIASDALAVLSRLDRLSRTVPHLL 1773 

Qy 1681 MWCSEFMSTFCYGIMAQAIRSEVDKQLIERCSRIILNLARYNSTTVNTFQEGGLVTIAQM 1740 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1774 MWCSEFMSTFCYGIMAQAIRSEVDKQLIERCSRIILNLARYNSTTVNTFQEGGLVTIAQM 1833 

Qy 1741 LLRWCDKDSEIFNTLCTLIWVFAHCPKKRKIIHDYMTNPEAIYMVRETKKLVARKEKMKQ 1800 

I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1834 LLRWCDKDSEIFNTLCTLIWVFAHCPKKRKIIHDYMTNPEAIYMVRETKKLVARKEKMKQ 1893 

Qy 1801 NARKPPPMTSGRYKSQKINFTPCSLPSLEPDFGIIRYSPYTFISSVYAFDTILCKLQIDM 1860 

I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I 

Db 1894 NARKPPPMTSGRYKSQKINFTPCSLPSLEPDFGIIRYSPYTFISSVYAFDTILCKLQIDM 1953 



Qy 1861 F 1861 

I 

Db 1954 F 1954 
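
The "Best Local Similarity" figure in each alignment header appears consistent with the number of matches divided by the total number of alignment columns (matches + conservative substitutions + mismatches + indels). The sketch below checks this against the counts printed for Results 1 to 3. The formula is an inference from the reported numbers, not a documented GenCore definition, and the function name is hypothetical.

```python
def best_local_similarity(matches, conservative, mismatches, indels):
    """Matches as a percentage of all alignment columns.

    Assumed definition, inferred from the figures in this report.
    """
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns


# Counts copied from the "Matches ...; Conservative ...;" header lines.
counts = {
    "AAY90350": (1861, 0, 0, 0),        # reported 100.0%
    "ABB62757": (1855, 4, 2, 0),        # reported 99.7%
    "ADX83191": (480, 360, 630, 640),   # reported 22.7%
}
for result_id, c in counts.items():
    print(result_id, round(best_local_similarity(*c), 1))
```

For Result 2 the 1861 columns equal the query length, which fits an ungapped, full-length alignment (Indels 0; Gaps 0).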



RESULT 3 
ADX83191 

ID ADX83191 standard; protein; 3477 AA. 
XX 

AC ADX83191; 
XX 

DT 05-MAY-2005 (first entry) 
XX 

DE Human TEG14 polypeptide SEQ ID NO 73. 
XX 

KW cytostatic; protein purification; diagnosis; cancer; cytostatic; 

KW neoplasm; respiratory disease; lung tumor; gastrointestinal disease; 

KW stomach tumor; colon tumor; hepatic tumor; selectable marker; TEG. 
XX 

OS Homo sapiens. 
XX 

PN WO2005014818-A1. 
XX 

PD 17-FEB-2005. 
XX 

PF 06-AUG-2004; 2004WO-JP011650.
XX 

PR 08-AUG-2003; 2003JP-00290704.
XX 

PA (PERS-) PERSEUS PROTEOMICS INC. 

PA (CHUS ) CHUGAI SEIYAKU KK. 

PA (ABUR/) ABURATANI H. 
XX 

PI Aburatani H, Hippo Y, Taniguchi H, Chen YX, Ishikawa S; 

PI Fukumoto S, Shimamura T, Kamimura N, Guo YQ, Yamamoto S, Ito Y; 

PI Ito H, Ohtomo T; 

XX 

DR WPI; 2005-173106/18. 

DR N-PSDB; ADX83135. 
XX 

PT Novel protein encoded by any one of TEG1 to TEG64, useful for diagnosing 

PT and treating cancer e.g. lung, hepatic, stomach, colon or pancreatic 

PT cancer. 
XX 

PS Disclosure; SEQ ID NO 73; 462pp; Japanese. 
XX 

CC The invention describes a protein (I) encoded by a gene having a 

CC nucleotide sequence chosen from any one of 65 fully defined 418-19341 

CC base pair sequences (SEQ ID No. 1-65) (TEG 1-64) or their fragments. Also 

CC described are: a protein (II) encoded by a gene having a nucleotide 

CC sequence chosen from SEQ ID No. 1, 2, 28, 29, 30, 31, 32, 51, 52, 60 and 

CC 61 or their fragments; a protein (III) encoded by a gene having a 

CC nucleotide sequence chosen from any one of SEQ ID No. 3-13, 22-27 and 33- 

CC 55 or their fragments; protein (IV) encoded by a gene having a nucleotide 

CC sequence chosen from SEQ ID No. 3, 7, 20, 21 and 46-50 or their fragments 

CC ; a protein (V) encoded by a gene having a nucleotide sequence chosen 



CC from SEQ ID No. 14-19, 43-45, 56-59 and 62-65 or their fragments; an 

CC antibody (VI) that specifically recognizes any one of (I)-(V) or their 

CC fragments; a polynucleotide (VII) complementary to the nucleotide 

CC sequence of any one of SEQ ID No. 1-65 or a polynucleotide sequence 

CC capable of hybridizing with SEQ ID No. 1-65; a polynucleotide (VIII) 

CC comprising at least 12 consecutive nucleotides in any one SEQ ID No. 1-65 

CC or a polynucleotide sequence capable of hybridizing with this nucleotide; 

CC a composition (C1) for diagnosing or treating lung cancer; a composition 

CC (C2) for diagnosing or treating stomach cancer; composition (C3) for 

CC diagnosing or treating colon cancer; a composition (C4) for diagnosing or 

CC treating hepatic cancer; a vector (IX) comprising (VII) or (VIII); a cell 

CC (X) comprising (IX); identifying (M1) a compound having anticancer 

CC activity; and diagnosing (M2) cancer. Proteins (I)-(V), an antibody (VI) 

CC that specifically binds the proteins and polynucleotides (VII) and (VIII) 

CC are useful for diagnosing cancer. (M2) comprising measuring the 

CC expression level of (I)-(V) or (VII) or (VIII), or obtaining sample 

CC (blood serum or plasma), and detecting C20orf102 protein in the obtained 

CC sample is also useful for diagnosing cancer such as lung cancer, hepatic 

CC cancer or pancreatic cancer, where the C20orf102 protein is a secreted or 

CC extracellular C20orf102 protein, which is detected using an antibody 

CC which recognizes C20orf102 protein. A composition (C1) comprising protein 

CC (II) is useful for diagnosing or treating lung cancer. A composition (C2) 

CC comprising protein (III) is useful for diagnosing or treating stomach 

CC cancer. A composition (C3) comprising protein (IV) is useful for 

CC diagnosing or treating colon cancer. A composition (C4) comprising (V) is 

CC useful for diagnosing or treating hepatic cancer. (M1) comprising 

CC contacting a cultured human cell with a test compound and identifying a 

CC compound that causes a change in expression level of the gene which 

CC contains the nucleotide sequence in any one of SEQ ID No. 1-65 is useful 

CC for identifying a compound having anticancer activity. A vector (IX) or 

CC cell (X) is useful for producing (I) or (VI). An antibody (VI) is useful 

CC as a cancer diagnostic marker. This is the amino acid sequence of a human 

CC TEG polypeptide. 

XX 

SQ Sequence 3477 AA; 

Query Match 14.3%; Score 1360; DB 9; Length 3477; 

Best Local Similarity 22.7%; Pred. No. 4.6e-93; 

Matches 480; Conservative 360; Mismatches 630; Indels 640; Gaps 68; 

Qy 5 WSPVLEVACKETLQLIDNRNFRKEVMIILKSKSNQPVKNPRKFPTVGKTLQLKSPTGAGK 64 

I : I : I : I : : I : I I : I : I I : I : 
Db 110 WTPLKEGRVREIMTFLVN-DVLKHQAILLGNAEEQKKKKRSLWDTI 154 

Qy 65 TMKSWSAAVQQKKRMS AAAAPPSKQTWRVTAPSRPAAWAHPPPQAPLVEKNVYK 119 

I : I I : : I : I | : : | | : | : III 

Db 155 -KKKKISASTSHNRRVSNIQNVNKTFSVSQKVDRVRSPLQACENLAMNEGGPPTENNSL- 212 

Qy 120 TPQEEPVYISP QPRSLK ENLSPMTPGNLLDV IDNLRFT 157 

: I : I I I II:: : I II : I : : I 

Db 213 ILEENKIPISPISPAFNECHGATCLPLSVRRSTTYSSLHASENRELLNVHSANVSKVSFN 272 

Qy 158 — PLTET RGKGQATIFPDNLAAWPTPTLKGNVKSCAN — DMRPRRITPDD 203 

: I II II: : I I I I : : : I I 

Db 273 EKAVTETSFNSVNVNGQRGENSKL SLTPNCSSTLNITQSQIHFLSPDS 320 



Qy 204 LEDQPATNKTFDVKHSETINISLDTLDCSRID GQPHTPLNKTTTIVHATHTRALACI 260 



I : I I I : : I I | :: : M | : : | 
Db 321 F VNNSHGANNELELVTCLSSDMFMKDNSQPVHLESTIAHEIYQKIL 366 

Qy 261 HEEEGPSPPRTPTKSAIHD LKRDIKLVG-SPLRKYSESMKDLSLLSPQTKYAIQGSM 316 

II III I : I : : : I : : : : I I : : : : 

Db 367 SP DSFIKDNYGLNQDLESESVNPILSPNQFLKDNMAYMCTSQQTCKVPL 415 

Qy 317 PNLNEMKIRSIEQNRYYQEQQIQIKAKDLNSSSSSEASLAGQQEFLFNHSEILAQSS-RF 375 

II :| I ::::::: | j :| | j: : j:: :| 

Db 416 SNENSQVPQSPED WRKSEVSPRI PECQGSKSPKAIFEELVEMKSNYYSFIKQNNPKF 472 

Qy 376 N-LHEVGRKSVKGSPVKNP HKRRS HELSF 403 

: : :: I I : I I I :: M 

Db 473 SAVQDISSHSHNKQPKRRPILSATVTKRKATCTRENQTEINKPKAKRCLNSAVGEHEKVI 532 

Qy 404 SDAPSNESL YRNE TVAISPPKKQ — RVEDTTLPRSAAP 439 

: : I I : I I I : : : : j : II : I I 

Db 533 NNQKEKEDFHSYLPIIDPILSKSKSYKNEVTPSSTTASVARKRKSDGSMEDANV-RVAIT 591 

Qy 440 ANASARS-SSAHAWPHAQ SKKFKLAQTMSLMKKP ATPRKV 478 

: I II II: : : : I I I II 

Db 592 EHTEVREIKRIHFSPSEPKTSAVKKTKNVTTPISKRISNREKLNLKKKTDLSIFRTPISK 651 

Qy 479 RDTSIQPSVKLYDSELYMQTCINPDPFAATTTID PFLASTMYLDEQAVDRHQADFK 534 

: : I : : I I III I I || I I : II : :: : I 

Db 652 TNKRTKPIIAVAQSSL TFIKP LKTDIPRHPMPFAAKNMFYDERWKEKQEQGFT 704 

Qy 535 KWLNALVSIPADLDADLN-NKIDVGKLFNEVRNKELW APTKEEQSMN-YLTKYRLE 589 

III ::: I I I :::: I : |: : IIIIM |: I : II 

Db 705 WWLNFILT-PDDFTVKTNISEVNAATLLLGIENQHKISVPRAPTKEEMSLRAYTARCRLN 763 

Qy 590 TLRKAAVELFFSEQMRLPCSKVAVYVNKQALRIRSDRNLHLDVVMQRTILELLLCFNPLW 649 

11:11 I I I I : I I : : : : I : I I I : I II : : : I I I : I I II 

Db 764 RLRRAACRLFTSEKMVKAIKKLEIEIEARRLIVRKDRHLWKDVGERQKVLNWLLSYNPLW 823 

Qy 650 LRLGLEWFGEKIQMQSNRDIVGLSTFILNRLFRN KCEEQRYSKAYTLTEEYAETIK 706 

I I : I M : I I I : : I I : M : I I II I I I I : : : : : I : 

Db 824 LRIGLETTYGELISLEDNSDVTGLAMFILNRLLWNPDIAAEYRHPTVPHLYRDGHEEALS 883 

Qy 707 KHSLQKILFLLPFLDQAKQKRIVKHNPCLFVKKSPHKETKDILLRFSSELLANIGDITRE 766 

I : I : I : I I : I I I I I I : : I : I I II I : I : I : I I I II : I : I I : : I 
Db 884 KFTLKKLLLLVCFLDYAKISRLIDHDPCLFCKDAEFKASKEILLAFSRDFLSGEGDLSRH 943 

Qy 767 LRRLGYVLQHRQTFLDEFDYAFNNLAVDLRDGVRLTRWEVILLRDDLTRQLRVPAISRL 826 

I II : I II I II I : I I M I I I : M II I : I : : I I : : : I I : I I I I I I 
Db 944 LGLLGLPVNHVQTPFDEFDFAVTNLAVDLQCGVRLVRTMELLTQNWDLSKKLRIPAISRL 1003 

Qy 827 QRIFNVKLALGALGEANFQL GGDIAAQDIVDGHREKTLSLLWQLIYKFR 875 

I : : I I : I I : I I I : : M II I M M I I II : : : I : 

Db 1004 QKMHNVDIVLQVLKSRGIELSDEHGNTILSKDIVDRHREKTLRLLWKIAFAFQVDISLNL 1063 

Qy 876 875 

Db 1064 DQLKEEIAFLKHTKSIKKTISLLSCHSDDLINKKKGKRDSGSFEQYSENIKLLMDWVNAV 1123 



Qy 876 875 

Db 1124 CAFYNKKVENFTVSFSDGRVLCYLIHHYHPCYVPFDAICQRTTQTVECTQTGSWLNSSS 1183 



Qy 876 SPKFH 880 

I I 

Db 1184 ESDDSSLDMSLKAFDHENTSELYKELLENEKKNFHLVRSAVRDLGGIPAMINHSDMSNTI 1243 

Qy 881 AAATVLQKWWRRHWLHWIQRRIRHKELMRRHRAAT 916 

I I :: I I I :: I :: I I : I I : M 
Db 1244 PDEKWITYLSFLCARLLDLRKEIRAARLIQTTWRKYKLKTDLK RHQE REKAAR 1297 

Qy 917 VIQAVFRGHQMRKYVKLFKTERTQAAIILQKFTRRYLAQKQLYQ 960 

: I I : : : : : I I I : : : I I : I I I I I : : I 
Db 1298 IIQLAVINFLAKQRLR KRVNAALVIQKYWRRVLAQRKLLMLKKEKLEKVQNKAASL 1353 

Qy 961 SYHSII TIQRWWRAQQLGRQH 981 

hill II I I I II : I 

Db 1354 IQGYWRRYSTRQRFLKLKYYSIILQSRIRMIIAVTSYKRYLWATVTIQRHWRAYLRRKQD 1413 

Qy 982 RQRFVELREAAIFLQ RIWRRRLFAKKLLAAAETAR LQRSQKQQAAASYIQM 1032 

: I I : !::::! I I : : I : : I I I : : I : : : I II 

Db 1414 QQRYEMLKSSTLIIQSMFRKWKQRKMQSQVKATVILQRAFREWHLRKQAKEENSAIIIQS 1473 

Qy 1033 QWRTYQLGRIQRHEFLRQRDLIMFVQRRMRSKWSMLEQRKEFQQLKRAAINIQQRWRAKL 1092 

: I : : : : : : I : : : | : | | : : | : : : | : : | | : : : | | 

Db 1474 WYRMHK ELRKYI YIRSCWIIQKRFR CFQAQKLYKRRKESILTIQKYYKAYL 1525 

Qy 1093 SMRKCNADYLALRSSVLKVQA YRKATI QMRIDRNHYYSLRKN 1134 

: : I I I : : : : : I I | | : : M I I : : I : I 

Db 1526 KGKIERTNYLQKRAAAIQLQAAFRRLKAHNLCRQIRAACVIQSYWRMRQDRVRFLNLKKT 1585 

Qy 1135 VICLQQRLRAIMKMREQRENYLRLRNASILVQKRYRMRQQMIQDRNAYLRTRKCIINVQR 1194 

: I : : I : : : | | : | : : : | : : : : | : | : : : I : I I : I : I 

Db 1586 II KFQAHVRKHQQRQKYKKMKKAAVIIQTHFRAYIFAMKVLASYQKTRSAVIVLQS 1641 

Qy 1195 RWRATLQMRRERKNYLHLQTTTKRIQIKFRA KRE MKKQRAE 1235 

: I : I I I : I : I : : M : I I hi I I : I : 

Db 1642 AYRG MQARKMYIHILTSVIKIQSYYRAYVSKKEFLSLKNATIKLQSTVKMKQTRKQ 1697 

Qy 1236 FLQLKKVTLWQKRRRALLQMRKERQEYLHLREVTIKLQ RRFHAQKSMRFMRAKYRG 1292 

: I I : I : I : I : : : I : | I : : | | Mil I : : I I I I 
Db 1698 YLHLRAAALFIQQCYRSKKIAAQKREEYMQMRESCIKLQAFVRGYLVRKQMRLQR 1752 

Qy 1293 TQAAVSCLQMHWRNHLLRKRERNSFLQLRQAAITLQRRYRARLNMIKQLKSYAQLKQAAI 1352 

II I I : : I : : I : I : : : I I : I II : | | : : | : | : | | 

Db 1753 —KAVISLQSYFR MRKARQYYLKMYKAIIVIQNYYHAYKAQVNQRKNFLQVKKAAT 1806 

Qy 1353 TIQTRYRAKKAMQKQVVLYQKQREAIIKVQRRYRGNLEMRKQIEVYQKQRQAVIRLQKWW 1412 

: I II I I I : : I I : I : I : I I I : : : I I I : : I : : | : | : 

Db 1807 CLQAAYRGYKVRQ LIKQQSIAALKIQSAFRG YNKRVK-YQSVLQSIIKIQRWY 1858 

Qy 1413 RSIRDMRLCKAGYRRIRLSSLSIQ RKWRATVQARRQREIFLSTIRKVRLMQAFIRAT 1469 

I : : : : : : : : : I : I I I : III:: 
Db 1859 RAYKTLHDTRTHFLKTKAAVISLQSAYRGWKVRKQIRREHQ 1899 

Qy 1470 LLMRQQRREFEMKRRAAWIQRRFRARCAMLKARQDYQLIQSSVILVQRKFRANRSMKQA 1529 

I I : I I II I I I : : : : I : : : : : : I : I II : : : 

Db 1900 AALKIQSAFR MAKAQKQFRLFKTAALVIQQNFRAWTAGRKQ 1940



Qy 1530 RQEFVQLRTIAVHLQQKFRGK RLMIEQRNCFQLLRCSMPGFQARARGFMARKRFQAL 1586 

I : : : I I : I I : : I I I : I I : : I : I : : I : : : : 

Db 1941 CMEYIELRHAVLVLQSMWKGKTLRRQLQRQHKCAIII QSYYRMHVQQKKWKIM 1993 

Qy 1587 MTPEMMDLIRQKRAAKVIQRYWRGYLIRRRQKHQGL LDIRKRIAQ 1631 

I : I I : I I : I : I I I I I I I : : I I I I 

Db 1994 KKAALLIQKYYRAYSIGREQNHLYLKTKAAWTLQSAYRGMKVRKRIKD 2042 

Qy 1632 LRQEAKAVNS 1641 

: I : I 
Db 2043 CNKAAVTIQS 2052 



GenCore version 5.1.6
Copyright (c) 1993 - 2005 Compugen Ltd. 



OM protein - protein search, using sw model

Run on:         November 18, 2005, 17:09:58 ; Search time 220 Seconds
                                              (without alignments)
                                              3534.456 Million cell updates/sec

Title:          US-09-914-698-1
Perfect score:  9514
Sequence:       1 MELVWSPVLEVACKETLQLI...FISSVYAFDTILCKLQIDMF 1861

Scoring table:  BLOSUM62
                Gapop 10.0 , Gapext 0.5

Searched:       1867569 seqs, 417829326 residues

Total number of hits satisfying chosen parameters:      1867569

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries

Database :       Published_Applications_AA_Main:*
                 1:  /cgn2_6/ptodata/1/pubpaa/US07_PUBCOMB.pep:*
                 2:  /cgn2_6/ptodata/1/pubpaa/US08_PUBCOMB.pep:*
                 3:  /cgn2_6/ptodata/1/pubpaa/US09_PUBCOMB.pep:*
                 4:  /cgn2_6/ptodata/1/pubpaa/US10A_PUBCOMB.pep:*
                 5:  /cgn2_6/ptodata/1/pubpaa/US10B_PUBCOMB.pep:*
                 6:  /cgn2_6/ptodata/1/pubpaa/US11_PUBCOMB.pep:*
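The throughput figure in the search header can be cross-checked against the other values the header reports. The sketch below assumes (the report does not state this) that one "cell update" is one comparison of a query residue with a database residue, and that the search time reads 220 seconds. The function name is illustrative only.

```python
def million_cell_updates_per_sec(query_len, db_residues, seconds):
    """Assumed definition: one cell per (query residue, database residue) pair."""
    return query_len * db_residues / seconds / 1e6


# Values read from the search header: 1861-residue query,
# 417829326 residues searched, 220 seconds of search time.
rate = million_cell_updates_per_sec(1861, 417829326, 220)
print(f"{rate:.3f} Million cell updates/sec")
```

Under that assumption the result agrees with the 3534.456 Million cell updates/sec printed in the header.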

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
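The report does not say how the score distribution is analysed to obtain Pred. No. As a rough illustration only, the sketch below fits an extreme-value (Gumbel) distribution to a set of background scores by the method of moments and multiplies the tail probability by the number of sequences. The choice of distribution, the fitting method, and the function name are all assumptions, not GenCore's documented procedure.

```python
import math
from statistics import mean, pstdev

EULER_GAMMA = 0.5772156649015329


def predicted_chance_hits(score, background_scores):
    """Expected number of background hits scoring >= score (Gumbel sketch)."""
    # Method-of-moments estimates of the Gumbel scale and location.
    beta = pstdev(background_scores) * math.sqrt(6) / math.pi
    mu = mean(background_scores) - EULER_GAMMA * beta
    # Tail probability P(S >= score) = 1 - exp(-exp(-(score - mu) / beta)).
    tail = -math.expm1(-math.exp(-(score - mu) / beta))
    return len(background_scores) * tail


# Hypothetical background scores, for illustration only.
background = list(range(30, 70))
print(predicted_chance_hits(50, background))
print(predicted_chance_hits(200, background))
```

A score far above the bulk of the distribution gives a value close to zero, which is the sense in which the top hits in this report carry a Pred. No. of 0.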
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Sequence 


82, Appl 
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Sequence 
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us- 


10- 


012- 


697-1548 


Sequence 


1548, Ap 
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Sequence 
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Sequence 
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us- 
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143-13074 


Sequence 


13074, A 
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us- 
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S equence 


260, App 
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us- 
    26    262.5    2.8   2568  5  US-10-828-985A-7
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7 
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5 
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10- 
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763-36660 


S ecruence 


36660 A 


31 


254 . 5 


2 


7 
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4 


us- 
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493-22734 


Sequence 


22734, A 


32 


252.5 


2 


7 
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5 


us- 


10- 


977- 
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S e mi p n p 

^ U ^ XX v_>^ 


33 ADnl 


33 
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6 


us- 


11- 


097- 


143-40116 


Secruence 


40116, A 


34 


250.5 
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6 
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4 


us- 
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408- 
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Secruence 

k,^ si V.^ U ^ X 1 
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35 
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36 
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6 
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5 
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977- 
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SecTiJ en ce 


3 2 Ann 1 


37 
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6 


2101 


5 


us- 


10- 


723- 


681-18 


Sequence 


18, Appl 


    38    249.5    2.6   1439  5  US-10-754-079-157          Sequence 157, App
    39    249.5    2.6   2503  5  US-10-828-985A-11          Sequence 11, Appl
    40    247.5    2.6   2246  5  US-10-450-763-36209        Sequence 36209, A
    41    244.5    2.6   1401  4  US-10-408-765A-2125        Sequence 2125, Ap
    42    239.5    2.5   4365  5  US-10-472-928-3660         Sequence 3660, Ap
    43    238.5    2.5   1489  6  US-11-097-143-6636         Sequence 6636, Ap
    44    238.5    2.5   1893  4  US-10-408-765A-1696        Sequence 1696, Ap
    45    238.5    2.5   3674  4  US-10-291-265-454          Sequence 454, App



ALIGNMENTS 



RESULT 1 

US-11-097-143-15063 

; Sequence 15063, Application US/11097143
; Publication No. US20050208558A1
; GENERAL INFORMATION:
; APPLICANT: Venter, J. Craig
; APPLICANT: et al.
; TITLE OF INVENTION: DETECTION KIT, SUCH AS NUCLEIC ACID
; TITLE OF INVENTION: ARRAYS, FOR DETECTING EXPRESSION OF 10,000 OR MORE
; TITLE OF INVENTION: DROSOPHILA GENES.
; FILE REFERENCE: CL000728
; CURRENT APPLICATION NUMBER: US/11/097,143
; CURRENT FILING DATE: 2005-04-04
; PRIOR APPLICATION NUMBER: 60/157,832
; PRIOR FILING DATE: 1999-10-05
; PRIOR APPLICATION NUMBER: 60/160,191
; PRIOR FILING DATE: 1999-10-19
; PRIOR APPLICATION NUMBER: 60/161,932
; PRIOR FILING DATE: 1999-10-28
; PRIOR APPLICATION NUMBER: 60/164,769
; PRIOR FILING DATE: 1999-11-12
; PRIOR APPLICATION NUMBER: 60/173,383
; PRIOR FILING DATE: 1999-12-28
; PRIOR APPLICATION NUMBER: 60/175,693
; PRIOR FILING DATE: 2000-01-12
; PRIOR APPLICATION NUMBER: 60/184,831
; PRIOR FILING DATE: 2000-02-24
; PRIOR APPLICATION NUMBER: 60/191,637
; PRIOR FILING DATE: 2000-03-23
; NUMBER OF SEQ ID NOS: 43008
; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 15063
; LENGTH: 1954
; TYPE: PRT
; ORGANISM: DROSOPHILA
US-11-097-143-15063

  Query Match 99.7%; Score 9482; DB 6; Length 1954;
  Best Local Similarity 99.7%; Pred. No. 0;
  Matches 1855; Conservative 4; Mismatches 2; Indels 0; Gaps 0;

Qy      1 MELVWSPVLEVACKETLQLIDNRNFRKEVMIILKSKSNQPVKNPRKFPTVGKTLQLKSPT 60
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     94 MELVWSPVLEVACKETLQLIDNRNFRKEVMIILKSKSNQPVKNPRKFPTVGKTLQLKSPT 153

Qy     61 GAGKTMKSWSAAVQQKKRMSAAAAPPSKQTWRVTAPSRPAAWAHPPPQAPLVEKNVYKT 120
          |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    154 GAGKTMKSWSAAVQQKKRMSAAAAPPSKQTWRVTAPSRPAAWAHPPPQAPLVEKNVYKT 213

Qy    121 PQEEPVYISPQPRSLKENLSPMTPGNLLDVIDNLRFTPLTETRGKGQATIFPDNLAAWPT 180
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    214 PQEEPVYISPQPRSLKENLSPMTPGNLLDVIDNLRFTPLTETRGKGQATIFPDNLAAWPT 273

Qy    181 PTLKGNVKSCANDMRPRRITPDDLEDQPATNKTFDVKHSETINISLDTLDCSRIDGQPHT 240
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    274 PTLKGNVKSCANDMRPRRITPDDLEDQPATNKTFDVKHSETINISLDTLDCSRIDGQPHT 333

Qy    241 PLNKTTTIVHATHTRALACIHEEEGPSPPRTPTKSAIHDLKRDIKLVGSPLRKYSESMKD 300
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    334 PLNKTTTIVHATHTRALACIHEEEGPSPPRTPTKSAIHDLKRDIKLVGSPLRKYSESMKD 393

Qy    301 LSLLSPQTKYAIQGSMPNLNEMKIRSIEQNRYYQEQQIQIKAKDLNSSSSSEASLAGQQE 360
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    394 LSLLSPQTKYAIQGSMPNLNEMKIRSIEQNRYYQEQQIQIKAKDLNSSSSSEASLAGQQE 453

Qy    361 FLFNHSEILAQSSRFNLHEVGRKSVKGSPVKNPHKRRSHELSFSDAPSNESLYRNETVAI 420
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    454 FLFNHSEILAQSSRFNLHEVGRKSVKGSPVKNPHKRRSHELSFSDAPSNESLYRNETVAI 513



Qy 



421 



S PPKKQRVEDTTLPRSAAPANASARSSSAHAWPHAQSKKFKLAQTMSLMKKPATPRKVRD 480 
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I 



Db 


514 


Qy 


481 


Db 


574 


Qy 


541 


Db 


634 


Qy 


601 


Db 


694 


Qy 


661 


Db 


754 


Qy 


721 


Db 


814 


Qy 


781 


Db 


874 


Qy 


841 


Db 


934 


Ov 


901 


Db 


994 


1053 




Qy 


961 






Db 


1054 


1113 




Qy 


1021 






Db 


1114 


1173 




Qy 


1081 


1140 




Db 


1174 


1233 




Qy 


1141 


1200 





514 SPPKKQRVEDTTLPRSAAPANASARSSSAHAWPHAQSKKFKLAQTMSLMKKPATPRKVRD 573 

TSIQPSVKLYDSELYMQTCINPDPFAATTTIDPFLASTMYLDEQAVDRHQADFKKWLNAL 540 
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I M I I I I 
TSIQPSVKLYDSELYMQTCINPDPFAATTTIDPFLASTMYLDEQAVDRHQADFKKWLNAL 633 

VSIPADLDADLNNKIDVGKLFNEVRNKELWAPTKEEQSMNYLTKYRLETLRKAAVELFF 600 
I I I I I I I I I I I I I I M I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
VSIPADLDADLNNKIDVGKLFNEVRNKELWAPTKEEQSMNYLTKYRLETLRKAAVELFF 693 

SEQMRLPCSKVAVYVNKQALRIRSDRNLHLDWMQRTILELLLCFNPLWLRLGLEWFGE 660 
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
SEQMRLPCSKVAVYVNKQALRIRSDRNLHLDWMQRTILELLLCFNPLWLRLGLEWFGE 753 

KIQMQSNRDIVGLSTFILNRLFRNKCEEQRYSKAYTLTEEYAETIKKHSLQKILFLLPFL 720 

I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I II 
KIQMQSNRDIVGLSTFILNRLFRNKCEEQRYSKAYTLTEEYAETIKKHSLQKILFLLLFL 813 

DQAKQKRIVKHNPCLFVKKSPHKETKDILLRFSSELLANIGDITRELRRLGYVLQHRQTF 780 

II I II I I I I II I I II I I II I II I II I I II I M I I 11 I I II I II I I I I II I I II I M I I I I 
DQAKQKRIVKHNPCLFVKKSPHKETKDILLRFSSELLANIGDITRELRRLGYVLQHRQTF 873 

LDEFDYAFNNLAVDLRDGVRLTRWEVILLRDDLTRQLRVPAISRLQRIFNVKLALGALG 840 
I I I I I I II I M I I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I 
LDEFDYAFNNLAVDLRDGVRLTRVMEVILLRDDLTRQLRVPAISRLQRIFNVKLALGALG 933 

EANFQLGGDIAAQDIVDGHREKTLSLLWQLIYKFRSPKFHAAATVLQKWWRRHWLHWIQ 900 
I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
EANFQLGGDIAAQDIVDGHREKTLSLLWQLIYKFRSPKFHAAATVLQKWWRRHWLHWIQ 993 

RRI RHKELMRRHRAATVIQAVFRGHQMRKYVKLFKTERTQAAI I LQKFTRRYLAQKQLYQ 960 
I I I I I I I I I I I I II I II I II I I I I I I I I I I M I I I I I I I I II I I II I I I I I I I I I II I II 
RRI RHKELMRRHRAATVIQAVFRGHQMRKYVKLFKTERTQAAII LQKFTRRYLAQKQLYQ 



SYHSIITIQRWWRAQQLGRQHRQRFVELREAAIFLQRIWRRRLFAKKLLAAAETARLQRS 

I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I II I I I I I I I I I I I I I I M I I I I I I I 

SYHSIITIQRWWRAQQLGRQHRQRFVELREAAI FLQRIWRRRLFAKKLLAAAETARLQRS 



QKQQAAASYIQMQWRTYQLGRIQRHEFLRQRDLIMFVQRRMRSKWSMLEQRKEFQQLKRA 

I I I I I I I I I I I I I I I : M I I I I I I : I I I I I I I M I I I I I I I I I I I I I I I M I I I I I I I I 
QKQQAAASYIQMQWRSYQLGRIQRQQFLRQRDLIMFVQRRMRSKWSMLEQRKEFQQLKRA 



AINIQQRWRAKLSMRKCNADYLALRSSVLKVQAYRKATIQMRIDRNHYYSLRKNVICLQQ 

I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I II I I I I I 
AINIQQRWRAKLSMRKCNADYLALRSSVLKVQAYRKATIQMRIDRNHYYSLRKNVICLQQ 



I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I 



Db 1234 RLRAIMKMREQRENYLRLRNASILVQKRYRMRQQMIQDRNAYLRTRKCIINVQRRWRATL 

1293 

Qy 1201 QMRRERKNYLHLQTTTKRIQIKFRAKREMKKQRAEFLQLKKVTLWQKRRRALLQMRKER 

1260 

I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I 
Db 1294 QMRRERKNYLHLQTTTKRIQIKFRAKREMKKQRAEFLQLKKVTLWQKRRRALLQMRKER 

1353 

Qy 12 61 QEYLHLREVTIKLQRRFHAQKSMRFMRAKYRGTQAAVSCLQMHWRNHLLRKRERNSFLQL 

1320 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I 
Db 1354 QE YLHLREVTI KLQRRFHAQKSMRFMRAKYRGTQAAVSCLQMHWRNHLLRKRERNS FLQL 

1413 

Qy 1321 RQAAITLQRRYRARLNMIKQLKSYAQLKQAAITIQTRYRAKKAMQKQWLYQKQREAIIK 

1380 

I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I M I I 

Db 1414 RQAAITLQRRYRARLNMIKQLKSYAQLKQAAITIQTRYRAKKAMQKQWLYQKQREAIIK 

1473 

Qy 1381 VQRRYRGNLEMRKQIEVYQKQRQAVIRLQKWWRSIRDMRLCKAGYRRIRLSSLSIQRKWR 

1440 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1474 VQRRYRGNLEMRKQIEVYQKQRQAVIRLQKWWRSIRDMRLCKAGYRRIRLSSLSIQRKWR 

1533 

Qy 1441 ATVQARRQREIFLSTIRKVRLMQAFIRATLLMRQQRREFEMKRRAAWIQRRFRARCAML 

1500 

I I I I I I I I I I I M M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I 
Db 1534 ATVQARRQRE I FLST I RKVRLMQAFI RATLLMRQQRREFEMKRRAAWI QRRFRARCAML 

1593 

Qy 1501 KARQDYQLIQSSVILVQRKFRANRSMKQARQEFVQLRTIAVHLQQKFRGKRLMIEQRNCF 

1560 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1594 KARQDYQLIQSSVILVQRKFRANRSMKQARQEFVQLRTIAVHLQQKFRGKRLMIEQRNCF 

1653 

Qy 1561 QLLRCSMPGFQARARGFMARKRFQALMTPEMMDLIRQKRAAKVIQRYWRGYLIRRRQKHQ 

1620 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I 

Db 1654 QLLRCSMPGFQARARGFMARKRFQALMTPEMMDLIRQKRAAKVIQRYWRGYLIRRRQKHQ 

1713 

Qy 1621 GLLDIRKRIAQLRQEAKAVNSVRCKVQEAVRFLRGRFIASDALAVLSQLDRLSRTVPHLL 

1680 

I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I : I I I I I I I I I I I I 
Db 1714 GLLDIRKRIAQLRQEAKAVNSVRCKVQEAVRFLRGRFIASDALAVLSRLDRLSRTVPHLL 

1773 

Qy 1681 MWCSEFMSTFCYGIMAQAIRSEVDKQLIERCSRIILNLARYNSTTVNTFQEGGLVTIAQM 

1740 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I 
Db 1774 MWCSEFMSTFCYGIMAQAIRSEVDKQLIERCSRIILNLARYNSTTVNTFQEGGLVTIAQM 

1833 



Qy 1741 LLRWCDKDSEIFNTLCTLIWFAHCPKKRKIIHDYMTNPEAIYMVRETKKLVARKEKMKQ 

1800 

I I M I I I I I I I I I I I I I M I I M I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I M I I I I 

Db 1834 LLRWCDKDSEIFNTLCTLIWVFAHCPKKRKIIHDYMTNPEAIYMVRETKKLVARKEKMKQ 

1893 

Qy 1801 NARKPPPMTSGRYKSQKINFTPCSLPSLEPDFGIIRYSPYTFISSVYAFDTILCKLQIDM 

1860 

I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M 

Db 1894 NARKPPPMTSGRYKSQKINFTPCSLPSLEPDFGIIRYSPYTFISSVYAFDTILCKLQIDM 

1953 

Qy 1861 F 1861 

I 

Db 1954 F 1954 



RESULT 2 

US-10-108-260A-3399 

; Sequence 3399, Application US/10108260A
; Publication No. US20040005560A1
; GENERAL INFORMATION:
; APPLICANT: HELIX RESEARCH INSTITUTE
; TITLE OF INVENTION: Novel full length cDNA
; FILE REFERENCE: H1-A0106
; CURRENT APPLICATION NUMBER: US/10/108,260A
; CURRENT FILING DATE: 2002-03-27
; NUMBER OF SEQ ID NOS: 5458
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 3399
; LENGTH: 898
; TYPE: PRT
; ORGANISM: Homo sapiens
US-10-108-260A-3399

  Query Match 7.1%; Score 672.5; DB 4; Length 898;
  Best Local Similarity 25.2%; Pred. No. 1.1e-39;
  Matches 236; Conservative 182; Mismatches 309; Indels 209; Gaps 30;
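The Query Match percentage printed for each result appears to be the alignment score expressed as a fraction of the perfect score given in the search header (9514 for this query). The report does not state this relationship, so the sketch below treats it as an assumption and checks it against figures printed in the report. The function name is illustrative.

```python
def query_match_percent(score, perfect_score):
    """Assumed relationship: score as a percentage of the query's self-score."""
    return round(100.0 * score / perfect_score, 1)


PERFECT_SCORE = 9514  # "Perfect score" from the search header

# (score, reported Query Match %) pairs taken from the report
for score, reported in [(9482, 99.7), (672.5, 7.1), (249.5, 2.6)]:
    print(score, query_match_percent(score, PERFECT_SCORE), reported)
```

The computed and reported values agree for these rows, which supports the assumed definition without proving it.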




Qy 


850 


Db 


43 


Qy 


896 


Db 


99 


Qy 


956 


1015 




Db 


152 



-REKTLSLLWQLIYKFRSP KFHAA?VTVLQKWWRRHWL 895 

I: :|l I :: I I : I |:| :: I 



III : II : I I : : : I : 



956 KQLYQSYHSIITIQRWWRAQQLGRQHRQRFVELREAAIFLQRIWRRRLFAKKLLAAAETA 
I I I I I I I : I I : I I : : : I I II : | | : : : | 



Qy 

1058 



1016 



RLQRSQKQ QAAASYIQMQWRTYQLGRIQRHEFLRQRDLIMFVQ' 



I ' * • I I I -11 11 • I • III I i ; I ; ; ; I 
Db 212 LKIQSAFRMAKAQKQFRLFKTAALVIQQNFRAWTAGRKQCMEYIELRHAVLVLQSMWKGK 271 

Qy 1059 — RR MRSKWSMLEQRKEFQQLKRAAINIQQRWRAKLSMRKCNADYLALR 

1105 

II : : I : I I : I : : : : I : I I : I I : : II I : I I I : 

Db 272 TLRRQLQRQHKCAIIIQSYYRMHVQQKKWKIMKKAALLIQKYYRAYSIGREQNHLYLKTK 331 

Qy 1106 SSVLKVQ-AYRKATIQMRI DRNHYYSLRKNVICLQQRLRAIM 

1146 

::|: :| III :: II : I : I : I : I I I 

Db 332 AAWTLQSAYRGMKVRKRIKDCNKAAVTIQSKYRAYKTKKKYATYRASAIIIQRWYRGIK 391 

Qy 1147 KMREQRENYLRLRNASILVQKRY RMRQ— QMIQDRNAYLRTRK 

1187 

I : M I: : I : I I I : I : : I I I : I II 

Db 392 ITNHQHKEYLNLKKTAIKIQSVYRGIRVRRHIQHMHRAATFIKAMFKMHQSRISYHTMRK 451 

Qy 118 8 CIINVQRRWRATLQMRRERKNYLHLQTTTKRIQIKF RAKREMKK 

1231 

I : I I II I : : I : II : I : I I I : I : : I 

Db 452 AAIVIQVRCRAYYQGKMQREKYLTILKAVKVLQASFRGVRVRRTLRKMQTAATLIQSNYR 511 

Qy 1232 QRAEFLQLKKVTLWQKRRRALLQMRKERQEYLHLREVTIKLQRRFHAQKSMRFMRA 

1288 

I : I : I I I : I II : I I : : : I I M I : I I : I : I : : 
Db 512 RYRQQTYFNKLKKITKTVQQRYWAMKERNIQFQRYNKLRHSVIYIQAIFRGKKARRHLKM 571 

Qy 128 9 KYRGTQAAVSCLQMHWRNHLLRKRERNSFLQLRQAAITLQRRYRARLNMIKQLKSYAQLK 

1348 

: I : : I : I : : I : I I I I : : II : I I : I I I I I : I : : 

Db 572 MH lAATLIQRRFRTLMMRRR FLSLKKTAILIQRKYRAHL-CTKHHLQFLQVQ 622 

Qy 1349 QAAITIQTRYR AKKAMQKQ WLYQKQREAIIKVQRRYRGNL 

1389 

1111:11 : II : : : I I : : I : : I : : I : I 

Db 623 NAVIKIQSSYRRWMIRKRMREMHRAATFIQSTFRMHRLHMRYQALKQASWIQQQYQANR 682 

Qy 1390 EMRKQIEVYQKQRQAVIRLQKWWRSIRDMRLCKAGYRRIRLSSLSIQRKWRATVQARRQR 

1449 

: I : I : II : : I I : I : : I I : : I : I I : I : : II 
Db 683 AAKLQRQHYLRQRHSAVILQAAFRGMKTRRHLKS MHSSATLIQSRSRSLLVRRR — 736 

Qy 1450EI FLSTI RKVRLMQAFI RATLLMRQQRREFEMKRRAAWIQRRFRARCAMLKARQDYQLI 

1509 

I : I : : I III: : : : I I : II : I I : I I : : I : 

Db 737 —FISLKKATIFVQRKYRATICAKHKLYQFLHLRKAAITIQSSYR RLMVKKKLQEM 790 

Qy 1510 QSS VI LVQRKFRANRSMKQARQEFVQLRT lAVHLQQKFRGKRLMI EQRNCFQLLRCS 

1566 

I : : I : I II : I : : : : I : : : M : | | M 
Db 791 QRAAVLIQATFRMHRT YITFQTWKHASILIQQHYRTYRAAKLQRE 835 

Qy 1567 MPGFQARARGFMARKRFQALMTPEMMDLIRQKRAAKVIQRYWRGYLIRR — RQKHQGLLD 

1624 

: I I I : I I II : : I I : I : M : : 



Db 836 NYIRQWHSAWIQAAYKGMKARQLLREKHKASIV 869 

Qy 1625 IRKRIAQLRQEAKAVNSVRC KVQEAVRFLRGRF 1657 

I : II I I : I I : : : : : 

Db 870 IQSTYRMYRQ YCFYQKLQWATKIIQEKY 897 



GenCore version 5.1.6 
Copyright (c) 1993 - 2005 Compugen Ltd. 



OM protein - protein search, using sw model

Run on:         November 18, 2005, 17:10:37 ; Search time 8 Seconds
                                              (without alignments)
                                              262.817 Million cell updates/sec

Title:          US-09-914-698-1
Perfect score:  9514
Sequence:       1 MELVWSPVLEVACKETLQLI...FISSVYAFDTILCKLQIDMF 1861

Scoring table:  BLOSUM62
                Gapop 10.0 , Gapext 0.5

Searched:       8323 seqs, 1129788 residues

Total number of hits satisfying chosen parameters:      8323

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries

Database :       Published_Applications_AA_New:*
                 1:  /cgn2_6/ptodata/1/pubpaa/US10_NEW_PUB.pep:*
                 2:  /cgn2_6/ptodata/1/pubpaa/US06_NEW_PUB.pep:*
                 3:  /cgn2_6/ptodata/1/pubpaa/US07_NEW_PUB.pep:*
                 4:  /cgn2_6/ptodata/1/pubpaa/US08_NEW_PUB.pep:*
                 5:  /cgn2_6/ptodata/1/pubpaa/US09_NEW_PUB.pep:*
                 6:  /cgn2_6/ptodata/1/pubpaa/PCT_NEW_PUB.pep:*
                 7:  /cgn2_6/ptodata/1/pubpaa/US11_NEW_PUB.pep:*
                 8:  /cgn2_6/ptodata/1/pubpaa/US60_NEW_PUB.pep:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
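The search header names an "sw model" with gap parameters Gapop 10.0 and Gapext 0.5, which points to Smith-Waterman local alignment with affine gap penalties. The sketch below is a minimal score-only version of that technique. It is not GenCore's implementation: it uses a flat match/mismatch score in place of BLOSUM62, and it assumes a gap of length k costs gap_open + k * gap_ext, a convention the report does not specify.

```python
def smith_waterman_affine(a, b, match=5.0, mismatch=-4.0,
                          gap_open=10.0, gap_ext=0.5):
    """Best local alignment score with affine gaps (Gotoh recurrences)."""
    n, m = len(a), len(b)
    neg_inf = float("-inf")
    # H: best score ending at (i, j); E / F: best score ending in a gap.
    H = [[0.0] * (m + 1) for _ in range(n + 1)]
    E = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    F = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    best = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Extend an existing gap, or open a new one from H.
            E[i][j] = max(E[i][j - 1] - gap_ext,
                          H[i][j - 1] - gap_open - gap_ext)
            F[i][j] = max(F[i - 1][j] - gap_ext,
                          H[i - 1][j] - gap_open - gap_ext)
            s = match if a[i - 1] == b[j - 1] else mismatch
            # Local alignment: scores never drop below zero.
            H[i][j] = max(0.0, H[i - 1][j - 1] + s, E[i][j], F[i][j])
            best = max(best, H[i][j])
    return best


# Ten matches minus one gap of length 1: 10 * 5 - (10 + 0.5)
print(smith_waterman_affine("ACDEFGHIKL", "ACDEFWGHIKL"))
```

Aligning a sequence against itself returns its maximum attainable score under the chosen scoring, which is the role the "Perfect score" line plays in the header.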
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ALIGNMENTS 



RESULT 1 
US-11-150-406-2 

; Sequence 2, Application US/11150406
; Publication No. US20050250164A1
; GENERAL INFORMATION:
; APPLICANT: Muschler, John L
; APPLICANT: Bissell, Mina J
; TITLE OF INVENTION: Design of Novel Assays Based on the Newly Found Role of
; TITLE OF INVENTION: Dystroglycan and Alpha-Dystroglycan Proteolysis in Tumor Cell
; TITLE OF INVENTION: Growth
; FILE REFERENCE: IB-1398A
; CURRENT APPLICATION NUMBER: US/11/150,406
; CURRENT FILING DATE: 2005-06-09
; PRIOR APPLICATION NUMBER: 60/151,766
; PRIOR FILING DATE: 1999-08-31
; PRIOR APPLICATION NUMBER: 09/652,493
; PRIOR FILING DATE: 2000-08-31
; NUMBER OF SEQ ID NOS: 2
; SOFTWARE: PatentIn version 3.3
; SEQ ID NO 2
; LENGTH: 895
; TYPE: PRT
; ORGANISM: homo sapiens
US-11-150-406-2

  Query Match 1.5%; Score 140.5; DB 7; Length 895;
  Best Local Similarity 20.3%; Pred. No. 0.00067;
  Matches 126; Conservative 84; Mismatches 252; Indels 159; Gaps 26;
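The Best Local Similarity figure is consistent with the number of identical positions divided by the total number of alignment columns. The sketch below assumes that the column count is the sum of the Matches, Conservative, Mismatches and Indels fields. That reading is inferred from the printed numbers and is not stated in the report.

```python
def best_local_similarity(matches, conservative, mismatches, indels):
    """Assumed definition: identities as a percentage of alignment columns."""
    columns = matches + conservative + mismatches + indels
    return round(100.0 * matches / columns, 1)


# Counts printed for this result: 126 / 84 / 252 / 159
print(best_local_similarity(126, 84, 252, 159))
```

The same formula applied to the counts of the other results in this report (for example 236 / 182 / 309 / 209) also gives their printed percentages.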

Qy 


26 


RKEVMIILKSKSNQPVKNPRKFPTVGKTLQLKSP — TGAGKTMKSWSAAVQQKKRMSAA 


83 


Db 


207 


1 : : : : : 1 1 : I : I I 1 1 II 1 1 : 1 : 1 : 
RIDLLHRMRSFSEVELHNMKLVPWNNRLFDMSAFMAGPGNPKKWENGALLSWKLGCSL 


266 


Qy 


84 


AAPPSKQTWRVTAPSRPAAWA HP PPQAPLVEKNVYKTP 

1 M : 1 1 : : 1 II | : : : | | 
NQNSVPDIHGVEAPAREGAMSAQLGYPWGWHIANKKPPLPKRVRRQIHATPTPVTAIGP 


121 


Db 
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326 


Qy 
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QEEPVYISPQPRS LKENLSP MTPGNLL DVIDNLRFTPLT 

1 1 1 1 1 1 1 1 :: 1 II : 1 1 : 
PTTAIQEPPSRIVPTPTSPAIAPPTETMAPPVRDPVPGKPTVTIRTRGAIIQTPTLGPIQ 


160 


Db 


327 


386 


Qy 
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ETRGKGQATIFPDNLAAWPTPTLKGNVKSCA NDMRPRRITPDDLEDQPATNKT 

II II: 1 1 1 : 1 1 : 1 : 1 1 1 1 : II 1 1 
PTRVSEAGTTVPGQIR— PTMTIPGYVEPTAVATPPTTTTKKPRVSTP KPATPST 


213 


Db 


387 


439 


Qy 
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FDVKHSETINISLDTLDCSRIDGQPHT — PLNKTTTIVHATHTRALACIHEEEGPSPP — 

1 1 : 1 1 1 : : 1 1 1 1 1 1 1 1 
DSTTTTTRRPTKKPRTPRPVPRVTTKVSITRL ETASPPTR 


269 


Db 


440 


479 


Qy 


270 


-RTPTKSAIH DLKRDIKLVGSPLRKYSESMKDLSLLSPQTKYAIQGSMPNLN 

III : 1 1 1 1 : : 1 1 : j I : : : 
IRTTTSGVPRGGEPNQRPELKNHIDRVDAWVGTYFEVK 1 PSDTFYDHEDTTTDKL 


320 


Db 


480 


534 


Qy 


321 


EMKIRSIEQNRYYQEQQIQIKAKDLNSSSSSEASLAGQQEFLFNHSEI — LAQSSRFNLH 
: : : : 1 1 : : : | : : : | | : | : : : : | : | : | 


378 


Db 


535 


KLTLKLREQQLVGEKSWVQFNSNSQLMYGLPDSSHVGKHEYFMHATDKGGLSAVDAFEIH 


594 


Qy 


379 


EVGRKSVKGSPVK NPHKRRS — HELSFSDAPSNESL 


412 


Db 


595 


1 : 1 : : 1 1 : : : 1 : 1 : II 
VHRRPQGDRAPARFKAKFVGDPALVLNDIHKKIALVKKLAFAFGDRNCSTITLQNITRGS 


654 


Qy 


413 


YRNETVAISPPKKQRV EDTTLPRSAAPANASA RSSSAHAWPHA 

: 1 1 : : 1 1 : : : II II 1 1 : : 1 : : : 1 
IWEWTNNTLPLEPCPKEQIAGLSRRIAEDDGKPR PAFSNALEPDFKATSITVTGSG 


455 


Db 


655 


711 


Qy 


456 


QSKKFKLAQTMSLMKKP--ATPRKVRDTSIQPSVKLYDSELYMQTCINPDPFAATTTIDP 

: : : : 1 1 1 : 1 1 : I : : : | : | | | | | 
SCRHLQFIPWPPRRVPSEAPPTEVPDRDPE KSSEDDVYLHTVIPAVWAAILLIAG 


513 


Db 


712 


768 



514 FLASTMY LDEQA 525 

: I I I : : I I 

769 IIAMICYRKKRKGKLTLEDQA 789 



GenCore version 5.1.6 
Copyright (c) 1993 - 2005 Compugen Ltd. 
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(without alignments ) 
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Scoring table: BLOSUM62 
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Total number of hits satisfying chosen parameters: 
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Post-processing: Minimum Match 0% 
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Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
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ALIGNMENTS 



RESULT 1 
T13845 

microtubule-associated protein - fruit fly (Drosophila melanogaster)
C;Species: Drosophila melanogaster
C;Date: 20-Sep-1999 #sequence_revision 20-Sep-1999 #text_change 09-Jul-2004
C;Accession: T13845
R;Saunders, R.D.; Avides, M.C.; Howard, T.; Gonzalez, C.; Glover, D.M.
J. Cell Biol. 137, 881-890, 1997
A;Title: The Drosophila gene abnormal spindle encodes a microtubule-associated protein that associates with the polar regions of the mitotic spindle.
A;Reference number: Z17792; MUID:97296495; PMID:9151690
A;Accession: T13845
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-1861 <SAU>
A;Cross-references: UNIPROT:O01401; UNIPARC:UPI000007E201; EMBL:U95171; NID:g1930121; PID:g1930122; PIDN:AAB51540.1
C;Genetics:
A;Gene: asp
A;Cross-references: FlyBase:FBgn0000140
C;Function:
A;Description: is required for the normal function of the mitotic spindle

  Query Match 100.0%; Score 9514; DB 2; Length 1861;
  Best Local Similarity 100.0%; Pred. No. 0;
  Matches 1861; Conservative 0; Mismatches 0; Indels 0; Gaps 0;
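Each result block carries its statistics as semicolon-terminated "name value" pairs, so they can be pulled into a dictionary with a small regular expression. The sketch below is one way to do that. The function name and the choice to return every value as a float are illustrative assumptions, not part of any GenCore tooling.

```python
import re

# A field name (letters, dots, spaces), then a number, an optional '%', and ';'.
_FIELD = re.compile(
    r"([A-Za-z][A-Za-z. ]*?)\s+(\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)%?\s*;"
)


def parse_stats(text):
    """Return {field name: numeric value} for one block of statistics lines."""
    return {name.strip(): float(value) for name, value in _FIELD.findall(text)}


block = """
Query Match 100.0%; Score 9514; DB 2; Length 1861;
Best Local Similarity 100.0%; Pred. No. 0;
Matches 1861; Conservative 0 ; Mismatches 0 ; Indels 0 ; Gaps 0 ;
"""
stats = parse_stats(block)
print(stats["Score"], stats["Pred. No."], stats["Gaps"])
```

The optional space before each semicolon is tolerated because the extracted text is inconsistent about it.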
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Db 1801 NARKPPPMTSGRYKSQKINFTPCSLPSLEPDFGIIRYSPYTFISSVYAFDTILCKLQIDM 1860 

Qy 1861 F 1861 

I 

Db 1861 F 1861 



RESULT 2 
T19957 

hypothetical protein C45G3.1 - Caenorhabditis elegans
C;Species: Caenorhabditis elegans
C;Date: 15-Oct-1999 #sequence_revision 15-Oct-1999 #text_change 09-Jul-2004
C;Accession: T19957
R;Barlow, K.
submitted to the EMBL Data Library, March 1997
A;Reference number: Z19203
A;Accession: T19957
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-1186 <WIL>
A;Cross-references: UNIPROT:O17666; UNIPARC:UPI000007D3B1; EMBL:Z92780; PIDN:CAB07174.1; GSPDB:GN00019; CESP:C45G3.1
A;Experimental source: clone C45G3
C;Genetics:
A;Gene: CESP:C45G3.1
A;Map position: 1
A;Introns: 21/1; 255/2; 363/2; 575/3; 893/3; 1017/2; 1042/1
C;Superfamily: Caenorhabditis elegans hypothetical protein C45G3.1

  Query Match 3.8%; Score 361.5; DB 2; Length 1186;
  Best Local Similarity 18.6%; Pred. No. 5.1e-12;
  Matches 244; Conservative 216; Mismatches 480; Indels 375; Gaps 47;

Qy 458 KKFKLAQTMSLMKK PATPRKVRDTSIQPSVKLYDSELYMQTCINPDPFAATTTIDPF 514 

: I II I II Ml : : I : : | : | : | | : : : : | 

Db 21 EKRLLDQVKSNTKKIDLRATERAFLESS PTSMNMRTPLNPS-ISSSVSDSPI 71 

Qy 515 LASTMYLDEQAVDRHQADFKKWLNALVSIPADLDADLNNKIDVGKLFNEV 564 

I : I I : I : : I I : : : | : : : : | : | : | 

Db 72 LS YDEKA-NKQIIALATWCNTM MELDVSEEMDLGESKAEACRNIQKMLKK 120 



Qy 565 RNKELWAPTKEEQSMNY LTKYRLETLRKAAVELFFSEQMRLPCSKVAVYVNKQALR 621 



Db 121 RSDTSEVENTQENARRRYQRIFEKNDPEWKKKCKQLLDDSGMD ASIKDLLSKNNVA 177 

Qy 622 IRSDRNLHLDWMQRTILELLLCFNPLWLRLGLEWFGEKIQMQSNRDIVGLSTFILNRL 681 

II: : : I : : I I : I I I : I I I : M : I : I I : I I I I : : 

Db 178 IRKEHAVYNDIGLQTTLLHTFLSFHPAWLKTALEAIFNTRIDAQPKHLMKKLSQFFLDLV 237 

Qy 682 FRN— KCEEQRYSKAY TLTEEYAETIKKHSLQKILFLLPFLDQAKQKRIVKHNPCLF 736 

II :::::: : M I : I I I : I : : : j | : : : : | 

Db 238 FSNPTMLKNKKFAQGSGKPIITEAGKEALHKHFLSVSMKLMFLIETAHTHRVIPNLTRIF 2 97 

Qy 737 VKKSPHKETKDILLRFSSELL-ANIGDITRELRRLGYVLQHRQTFLDEFDYAFNNLAVDL 795 

I I I : : I I I : : : : I : : : | | : | : : : | | : | 

Db 298 TKSSHFNCLDDVFSELTKELLTGSSATFKKAFAKVGFIPTYRQSFIENYDYQAKGFS-DF 356 

Qy 796 RDGVRLTRWEVI — LLRDDLTRQLRVPAISRLQRIFNVKLALGALGEANFQLG GDI 850 

I I : I : : : I : : : Mill I : : : I I I I : I : II : : 

Db 357 SDGLILAKLLETVGEMPHGQILLKLRDPAGDRIRKIGNVKIVLQEMS SLGVPTDNV 412 

Qy 851 AAQDIVDGHREKTLSLLWQLIYKFRSPKFHAAATVLQKWWRRHWLHWIQRRIR H 905 

I : I I I : : : I I : II : I : I : : II : 

Db 413 NAESIVGGKKDEILSILWAII GVRVAKEQRIKVTRVSE 450 

Qy 906 KELMRRHRAA TVIQAVFRGHQMRKYVKLFKTERTQAAIILQKFTRRYLAQKQ 957 

: : : I : I : III : : : : : I I : 

Db 451 ERTPKKRRSAVHDDMSSEVLKMCKIYGRQME — lEVMDLDSLSDGCLLAKLWTTFGTNST 508 

Qy 958 LYQSYHSIITIQRWWRAQQLGRQHRQRFVELREAAIFLQR— IWRRRLFAKKLLAAAETA 1015 

II : I : : I : I : : I I I I I I 
Db 509 PIQDYDG LSLW EKWSVAELELCIQRGLDQNMALFVKMFL E 54 9 

Qy 1016 RLQRSQKQQAAASYIQMQWRTYQLGRIQRHE FL RQRDLIMF 1056 

M I I : I I I : I : I I I : I I : I 

Db 550 RLGMIQDLNEKATKIQRMWKAY VQRKNTPKLYFIVQQLLADSSIPRNRSVSPFSNN 605 

Qy 1057 VQRRMRSKWSMLEQRKEFQQLKRAAINIQQRWRAKLSMRKCNADYLALRSSVLKVQ 1112 

I I I : : : I : I I : : I : : I : I I : : 

Db 606 VTFTVPRTPRN — NILTERPSLSQIPSS RQSMDSTFNDATFTVSRDSIESMN 655 

Qy 1113 AYRKA TIQMRIDRNHYYSLRKNVICLQQRLRAIMKMREQRENYLRLRNASI 1163 

: I Mil: : I : : I : : : I I : 
Db 656 KMQKTPLRGTFTRKTIAMVIEEEDDSENNETWPSTLKKRTWRMEHNAEVF 707 

Qy 1164 LVQKRYRMRQQMIQDRNAYLRTRKCIINVQRRWRATLQMRRERKNYLHLQTTTKRIQIKF 1223 

I : I I I : : I : I : I : I :: 

Db 708 REQDEDDEN QDKDTVAPSAE NLDSPPSDIPLET 740 

Qy 1224 RAKREMKKQRAEFLQLKKVTLWQKRRRALLQMRKERQEYLHLREVTIKLQRRFHAQKSM 1283 

: I I I I I : II 
Db 741 LSSIPSASQSAIFLQDSE TGKEM 763 

Qy 128 4 RFMRAKYRGT QAAVSCLQMHWRNHLLRKRERNSFLQLRQAAITLQRRYRARLNMIKQ 1340 

: I : I : I : I : : | : | : | : M : 

Db 764 HVPKAEDVGVWLEASDSPVALEGNN EASYDGQKIENLETFEIKE 808 

Qy 1341 LKSYAQLKQAAITIQTRYRAKKAMQKQWLYQKQREAIIKVQRRYRGNLEMRKQIEVYQK 1400 

I : I : I : : : : : | : : | : : : | : I : 



Db 



809 GKTQEDLPSKSPMDPTQTSGSPLVEFRMTEEQERLEMLFQSLSEDQKNFVKTNNLSVSIE 868 



Qy 14 01 QRQAVIRLQKWWRSIRDMRLCKAGYRRIRLSSLSIQRKWRATVQARRQREIFLSTIRKVR 1460 

I : : I I : : : I : I : I I I 
Db 8 69 DDANTPELRRILRQTRELK RKQQEI AR 895 

Qy 14 61 LMQAFIRATLLMRQQRREFEMKRRAAWIQRRFRARCAMLKARQDYQLIQSSVILVQRKF 1520 

: I I :| : II : I II ::|: I |: 

Db 896 KLGNIERNALAVRDGGEDSSDSRSDA GHDVAILHGDDSQLFENSMQLDQK — 945 

Qy 1521 RANRSMKQARQEFVQLRTIAVHLQQKFRGKRLMIEQRNCFQLLRCSMPGFQARARGFMAR 1580 

Db 946 945 

Qy 1581 KRFQALMTPEMMDLIRQKRAAKVIQRYWRGYLIRRRQKHQGLLDIRKRIAQ LRQEA 1636 

I I I : I : I I I I I : I I : : I I : : : : : || | : | | | | 
Db 94 6 SQLQNDETQILENKKKAAWIQKMIRGFIARRKFQME-ISNIRNRMIQYNHILAQED 1001 

Qy 1637 KAV NSVRCKVQEAVRFLRGRFIASDALAVL SQLDRLSRTVPHLL 1680 

: : II I : : : II : : I I I : : : I I : : I I I I 

Db 1002 EQIGIEEMEDKSVEAKLKKCA— LHG— LTNDNLHWHVAATVIDRVTDLVPSLL 1052 
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RESULT 1 
ASP_DROME 

ID ASP.DROME STANDARD; PRT; 1954 AA. 

AC Q9VC45; O01401; Q8SX66; 

DT 25-OCT-2004 (Rel. 45, Created) 

DT 25-OCT-2004 (Rel. 45, Last sequence update) 

DT lO-MAY-2005 (Rel. 47, Last annotation update) 

DE Abnormal spindle protein. 

GN Name=asp; ORFNames=CG6875 ; 

OS Drosophila melanogaster (Fruit fly). 

OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; 

OC Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpna; 

OC Ephydroidea; Drosopnilidae; Drosophila. 

ox NCBI_TaxlD=7227; 

RN [1] 

RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA]. 
RC STRAIN=Berkeley; 
RX MEDLINE=20196006; PubMed=10731132; DOI=10.1126/science.287.5461.2185; 

RA Adams M.D., Celniker S.E., Holt R.A., Evans C.A., Gocayne J.D., 
RA Amanatides P.G., Scherer S.E., Li P.W., Hoskins R.A., Galle R.F., 
RA George R.A., Lewis S.E., Richards S., Ashburner M., Henderson S.N., 
RA Sutton G.G., Wortman J.R., Yandell M.D., Zhang Q., Chen L.X., 
RA Brandon R.C., Rogers Y.-H.C., Blazej R.G., Champe M., Pfeiffer B.D., 
RA Wan K.H., Doyle C., Baxter E.G., Helt G., Nelson C.R., Miklos G.L.G., 
RA Abril J.F., Agbayani A., An H.-J., Andrews-Pfannkoch C., Baldwin D., 
RA Ballew R.M., Basu A., Baxendale J., Bayraktaroglu L., Beasley E.M., 
RA Beeson K.Y., Benos P.V., Berman B.P., Bhandari D., Bolshakov S., 
RA Borkova D., Botchan M.R., Bouck J., Brokstein P., Brottier P., 
RA Burtis K.C., Busam D.A., Butler H., Cadieu E., Center A., Chandra I., 
RA Cherry J.M., Cawley S., Dahlke C., Davenport L.B., Davies P., 
RA de Pablos B., Delcher A., Deng Z., Mays A.D., Dew I., Dietz S.M., 
RA Dodson K., Doup L.E., Downes M., Dugan-Rocha S., Dunkov B.C., Dunn P., 
RA Durbin K.J., Evangelista C.C., Ferraz C., Ferriera S., Fleischmann W., 
RA Fosler C., Gabrielian A.E., Garg N.S., Gelbart W.M., Glasser K., 
RA Glodek A., Gong F., Gorrell J.H., Gu Z., Guan P., Harris M., 
RA Harris N.L., Harvey D.A., Heiman T.J., Hernandez J.R., Houck J., 
RA Hostin D., Houston K.A., Howland T.J., Wei M.-H., Ibegwam C., 
RA Jalali M., Kalush F., Karpen G.H., Ke Z., Kennison J.A., Ketchum K.A., 
RA Kimmel B.E., Kodira C.D., Kraft C., Kravitz S., Kulp D., Lai Z., 
RA Lasko P., Lei Y., Levitsky A.A., Li J.H., Li Z., Liang Y., Lin X., 
RA Liu X., Mattei B., McIntosh T.C., McLeod M.P., McPherson D., 
RA Merkulov G., Milshina N.V., Mobarry C., Morris J., Moshrefi A., 
RA Mount S.M., Moy M., Murphy B., Murphy L., Muzny D.M., Nelson D.L., 
RA Nelson D.R., Nelson K.A., Nixon K., Nusskern D.R., Pacleb J.M., 
RA Palazzolo M., Pittman G.S., Pan S., Pollard J., Puri V., Reese M.G., 
RA Reinert K., Remington K., Saunders R.D.C., Scheeler F., Shen H., 
RA Shue B.C., Siden-Kiamos I., Simpson M., Skupski M.P., Smith T., 
RA Spier E., Spradling A.C., Stapleton M., Strong R., Sun E., 
RA Svirskas R., Tector C., Turner R., Venter E., Wang A.H., Wang X., 
RA Wang Z.-Y., Wassarman D.A., Weinstock G.M., Weissenbach J., 
RA Williams S.M., Woodage T., Worley K.C., Wu D., Yang S., Yao Q.A., 
RA Ye J., Yeh R.-F., Zaveri J.S., Zhan M., Zhang G., Zhao Q., Zheng L., 
RA Zheng X.H., Zhong F.N., Zhong W., Zhou X., Zhu S., Zhu X., Smith H.O., 
RA Gibbs R.A., Myers E.W., Rubin G.M., Venter J.C.; 

RT "The genome sequence of Drosophila melanogaster."; 
RL Science 287:2185-2195(2000). 

RN [2] 

RP GENOME REANNOTATION. 
RX MEDLINE=22426069; PubMed=12537572; 

RA Misra S., Crosby M.A., Mungall C.J., Matthews B.B., Campbell K.S., 
RA Hradecky P., Huang Y., Kaminker J.S., Millburn G.H., Prochnik S.E., 
RA Smith C.D., Tupy J.L., Whitfield E.J., Bayraktaroglu L., Berman B.P., 
RA Bettencourt B.R., Celniker S.E., de Grey A.D.N.J., Drysdale R.A., 
RA Harris N.L., Richter J., Russo S., Schroeder A.J., Shu S.Q., 
RA Stapleton M., Yamada C., Ashburner M., Gelbart W.M., Rubin G.M., 
RA Lewis S.E.; 

RT "Annotation of the Drosophila melanogaster euchromatic genome: a 

RT systematic review."; 

RL Genome Biol. 3:RESEARCH0083.1-RESEARCH0083.22(2002). 

RN [3] 

RP NUCLEOTIDE SEQUENCE OF 94-1954, FUNCTION, SUBCELLULAR LOCATION, AND 

RP DEVELOPMENTAL STAGE. 

RC STRAIN=Oregon-R; 
RX PubMed=9151690; DOI=10.1083/jcb.137.4.881; 
RA Saunders R.D.C., do Carmo Avides M., Howard T.I.A., Gonzalez C., 
RA Glover D.M.; 

RT "The Drosophila gene abnormal spindle encodes a novel microtubule- 

RT associated protein that associates with the polar regions of the 

RT mitotic spindle."; 

RL J. Cell Biol. 137:881-890(1997). 

RN [4] 

RP NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] OF 700-1954. 

RC STRAIN=Berkeley; TISSUE=Embryo; 
RX MEDLINE=22426066; PubMed=12537569; 
RA Stapleton M., Carlson J.W., Brokstein P., Yu C., Champe M., 

RA George R.A., Guarin H., Kronmiller B., Pacleb J.M., Park S., Wan K.H., 

RA Rubin G.M., Celniker S.E.; 

RT "A Drosophila full-length cDNA resource."; 

RL Genome Biol. 3:RESEARCH0080.1-RESEARCH0080.8(2002). 

RN [5] 

RP FUNCTION, AND SUBCELLULAR LOCATION. 

RX PubMed=10073938; DOI=10.1126/science.283.5408.1733; 

RA do Carmo Avides M., Glover D.M.; 

RT "Abnormal spindle protein, Asp, and the integrity of mitotic 

RT centrosomal microtubule organizing centers."; 

RL Science 283:1733-1735(1999). 

RN [6] 
RP FUNCTION. 

RX PubMed=15242765; DOI=10.1016/j.yexcr.2004.03.054; 
RA Riparbelli M.G., Massarelli C., Robbins L.G., Callaini G.; 
RT "The abnormal spindle protein is required for germ cell mitosis and 

RT oocyte differentiation during Drosophila oogenesis."; 

RL Exp. Cell Res. 298:96-106(2004). 

CC -!- FUNCTION: Required to maintain the structure of the centrosomal 
CC microtubule-organizing center (MTOC) during mitosis. May have a 

CC preferential role in regulating neurogenesis. Required for germ 

CC cell mitosis and oocyte differentiation. 

CC -!- SUBCELLULAR LOCATION: Cytoplasmic and nuclear. During interphase 

CC in syncytial embryos distribution is cytoplasmic, on entering 

CC mitosis, moves to polar regions of the spindle immediately 

CC surrounding the centrosome. At telophase, migrates to microtubules 

CC on the spindle side of both daughter nuclei. The nuclear- 

CC cytoplasmic distribution could be regulated by the availability of 

CC calmodulin. 

CC -!- DEVELOPMENTAL STAGE: Expressed both maternally and zygotically in 
CC embryos. 

CC -!- SIMILARITY: Contains 1 CH (calponin-homology) domain. 

CC -!- SIMILARITY: Contains 5 IQ domains. 

CC -!- CAUTION: Ref.4 sequence differs from that shown due to intron 
CC retention. 

CC 

CC This Swiss-Prot entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation - 

CC the European Bioinformatics Institute. There are no restrictions on its 

CC use as long as its content is in no way modified and this statement is not 

CC removed. 

CC 

DR EMBL; AE003749; AAF56330.3; -; Genomic_DNA. 

DR EMBL; U95171; AAB51540.1; mRNA. 

DR EMBL; AY094825; AAM11178.1; ALT_SEQ; mRNA. 

DR PIR; T13845; T13845. 

DR Ensembl; CG6875; Drosophila melanogaster. 

DR FlyBase; FBgn0000140; asp. 

DR GO; GO:0005875; C:microtubule associated complex; IDA. 

DR GO; GO:0005815; C:microtubule organizing center; IDA. 

DR GO; GO:0008017; F:microtubule binding; IDA. 

DR GO; GO:0004672; F:protein kinase activity; IDA. 

DR GO; GO:0000226; P:microtubule cytoskeleton organization and b. . .; IDA. 



DR InterPro; IPR001715; Calponin_act_bd. 

DR InterPro; IPR000048; IQ_caM_bd_region. 

DR Pfam; PF00307; CH; 1. 

DR Pfam; PF00612; IQ; 19. 

DR SMART; SM00033; CH; 1. 

DR SMART; SM00015; IQ; 16. 

DR PROSITE; PS50021; CH; 1. 

DR PROSITE; PS50096; IQ; 5. 

KW Calmodulin-binding; Cell cycle; Cell division; Coiled coil; 
KW Developmental protein; Differentiation; Microtubule; Mitosis; 
KW Nuclear protein; Oogenesis; Repeat. 



FT DOMAIN 836 968 CH. 
FT DOMAIN 1004 1033 IQ 1. 
FT DOMAIN 1386 1415 IQ 2. 
FT DOMAIN 1467 1496 IQ 3. 
FT DOMAIN 1656 1687 IQ 4. 
FT DOMAIN 1690 1721 IQ 5. 
FT COILED 1614 1641 Potential. 
FT COMPBIAS 1063 1749 Arg-rich. 
FT CONFLICT 811 811 L -> P (in Ref. 
FT CONFLICT 898 898 M -> V (in Ref. 
FT CONFLICT 1129 1129 S -> T (in Ref. 
FT CONFLICT 1138 1139 QQ -> HE (in Ref. 3). 

FT CONFLICT 1761 1761 R -> Q (in Ref. 1). 

SQ SEQUENCE 1954 AA; 230180 MW; 4912b4e20CA9E659 CRC64; 

Query Match 99.7%; Score 9482; DB 1; Length 1954; 

Best Local Similarity 99.7%; Pred. No. 0; 

Matches 1855; Conservative 4; Mismatches 2; Indels 0; Gaps 0; 
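
The summary percentages above appear to follow from the listed score and counts. The sketch below is an assumption inferred from the numbers in this report, not documented GenCore behaviour: Query Match is taken as the hit score divided by the perfect score (9514 in the run header), and Best Local Similarity as identical matches divided by all aligned columns (matches, conservative substitutions, mismatches and indels). The function names are hypothetical.

```python
def query_match(score, perfect_score):
    """Percent of the perfect (self-alignment) score reached by a hit."""
    return round(100.0 * score / perfect_score, 1)


def best_local_similarity(matches, conservative, mismatches, indels):
    """Percent of aligned columns that are identical matches (assumed definition)."""
    aligned_columns = matches + conservative + mismatches + indels
    return round(100.0 * matches / aligned_columns, 1)


if __name__ == "__main__":
    # Values taken from the RESULT 1 summary lines.
    print(query_match(9482, 9514))                # 99.7
    print(best_local_similarity(1855, 4, 2, 0))   # 99.7
```

Under the same assumed definitions, the RESULT 2 summary (Score 3093; Matches 640; Conservative 284; Mismatches 421; Indels 100) gives 32.5% and 44.3%, which agrees with the values printed in the report.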



Qy 1 MELVWSPVLEVACKETLQLIDNRNFRKEVMIILKSKSNQPVKNPRKFPTVGKTLQLKSPT 60 
Db 94 MELVWSPVLEVACKETLQLIDNRNFRKEVMIILKSKSNQPVKNPRKFPTVGKTLQLKSPT 153 

Qy 61 GAGKTMKSWSAAVQQKKRMSAAAAPPSKQTWRVTAPSRPAAWAHPPPQAPLVEKNVYKT 120 
Db 154 GAGKTMKSWSAAVQQKKRMSAAAAPPSKQTWRVTAPSRPAAWAHPPPQAPLVEKNVYKT 213 

Qy 121 PQEEPVYISPQPRSLKENLSPMTPGNLLDVIDNLRFTPLTETRGKGQATIFPDNLAAWPT 180 
Db 214 PQEEPVYISPQPRSLKENLSPMTPGNLLDVIDNLRFTPLTETRGKGQATIFPDNLAAWPT 273 

Qy 181 PTLKGNVKSCANDMRPRRITPDDLEDQPATNKTFDVKHSETINISLDTLDCSRIDGQPHT 240 
Db 274 PTLKGNVKSCANDMRPRRITPDDLEDQPATNKTFDVKHSETINISLDTLDCSRIDGQPHT 333 

Qy 241 PLNKTTTIVHATHTRALACIHEEEGPSPPRTPTKSAIHDLKRDIKLVGSPLRKYSESMKD 300 
Db 334 PLNKTTTIVHATHTRALACIHEEEGPSPPRTPTKSAIHDLKRDIKLVGSPLRKYSESMKD 393 

Qy 301 LSLLSPQTKYAIQGSMPNLNEMKIRSIEQNRYYQEQQIQIKAKDLNSSSSSEASLAGQQE 360 
Db 394 LSLLSPQTKYAIQGSMPNLNEMKIRSIEQNRYYQEQQIQIKAKDLNSSSSSEASLAGQQE 453 

Qy 361 FLFNHSEILAQSSRFNLHEVGRKSVKGSPVKNPHKRRSHELSFSDAPSNESLYRNETVAI 420 
Db 454 FLFNHSEILAQSSRFNLHEVGRKSVKGSPVKNPHKRRSHELSFSDAPSNESLYRNETVAI 513 

Qy 421 SPPKKQRVEDTTLPRSAAPANASARSSSAHAWPHAQSKKFKLAQTMSLMKKPATPRKVRD 480 
Db 514 SPPKKQRVEDTTLPRSAAPANASARSSSAHAWPHAQSKKFKLAQTMSLMKKPATPRKVRD 573 

Qy 481 TSIQPSVKLYDSELYMQTCINPDPFAATTTIDPFLASTMYLDEQAVDRHQADFKKWLNAL 540 
Db 574 TSIQPSVKLYDSELYMQTCINPDPFAATTTIDPFLASTMYLDEQAVDRHQADFKKWLNAL 633 

Qy 541 VSIPADLDADLNNKIDVGKLFNEVRNKELWAPTKEEQSMNYLTKYRLETLRKAAVELFF 600 
Db 634 VSIPADLDADLNNKIDVGKLFNEVRNKELWAPTKEEQSMNYLTKVRLETLRKAAVELFF 693 

Qy 601 SEQMRLPCSKVAVYVNKQALRIRSDRNLHLDWMQRTILELLLCFNPLWLRLGLEWFGE 660 
Db 694 SEQMRLPCSKVAVYVNKQALRIRSDRNLHLDWMQRTILELLLCFNPLWLRLGLEWFGE 753 

Qy 661 KIQMQSNRDIVGLSTFILNRLFRNKCEEQRYSKAYTLTEEYAETIKKHSLQKILFLLPFL 720 
Db 754 KIQMQSNRDIVGLSTFILNRLFRNKCEEQRYSKAYTLTEEYAETIKKHSLQKILFLLLFL 813 

Qy 721 DQAKQKRIVKHNPCLFVKKSPHKETKDILLRFSSELLANIGDITRELRRLGYVLQHRQTF 780 
Db 814 DQAKQKRIVKHNPCLFVKKSPHKETKDILLRFSSELLANIGDITRELRRLGYVLQHRQTF 873 

Qy 781 LDEFDYAFNNLAVDLRDGVRLTRWEVILLRDDLTRQLRVPAISRLQRIFNVKLALGALG 840 
Db 874 LDEFDYAFNNLAVDLRDGVRLTRVMEVILLRDDLTRQLRVPAISRLQRIFNVKLALGALG 933 
Qy 841 EANFQLGGDIAAQDIVDGHREKTLSLLWQLIVKFRSPKFHAAATVLQKWWRRHWLHWIQ 900 
Db 934 EANFQLGGDIAAQDIVDGHREKTLSLLWQLIYKFRSPKFHAAATVLQKWWRRHWLHWIQ 993 

Qy 901 RRIRHKELMRRHRAATVIQAVFRGHQMRKYVKLFKTERTQAAIILQKFTRRVLAQKQLYQ 960 
Db 994 RRIRHKELMRRHRAATVIQAVFRGHQMRKYVKLFKTERTQAAIILQKFTRRYLAQKQLYQ 1053 

Qy 961 SYHSIITIQRWWRAQQLGRQHRQRFVELREAAIFLQRIWRRRLFAKKLLAAAETARLQRS 1020 
Db 1054 SYHSIITIQRWWRAQQLGRQHRQRFVELREAAIFLQRIWRRRLFAKKLLAAAETARLQRS 1113 

Qy 1021 QKQQAAASYIQMQWRTYQLGRIQRHEFLRQRDLIMFVQRRMRSKWSMLEQRKEFQQLKRA 1080 
Db 1114 QKQQAAASYIQMQWRSYQLGRIQRQQFLRQRDLIMFVQRRMRSKWSMLEQRKEFQQLKRA 1173 

Qy 1081 AINIQQRWRAKLSMRKCNADYLALRSSVLKVQAYRKATIQMRIDRNHYYSLRKNVICLQQ 1140 
Db 1174 AINIQQRWRAKLSMRKCNADYLALRSSVLKVQAYRKATIQMRIDRNHYYSLRKNVICLQQ 1233 

Qy 1141 RLRAIMKMREQRENYLRLRNASILVQKRYRMRQQMIQDRNAYLRTRKCIINVQRRWRATL 1200 
Db 1234 RLRAIMKMREQRENYLRLRNASILVQKRYRMRQQMIQDRNAYLRTRKCIINVQRRWRATL 1293 

Qy 1201 QMRRERKNYLHLQTTTKRIQIKFRAKREMKKQRAEFLQLKKVTLWQKRRRALLQMRKER 1260 
Db 1294 QMRRERKNYLHLQ KRIQIKFRAKREMKKQRAEFLQLKKVTLWQKRRRALLQMRKER 1353 

Qy 1261 QEYLHLREVTIKLQRRFHAQKSMRFMRAKYRGTQAAVSCLQMHWRNHLLRKRERNSFLQL 1320 
Db 1354 QEYLHLREVTIKLQRRFHAQKSMRFMRAKYRGTQAAVSCLQMHWRNHLLRKRERNSFLQL 1413 

Qy 1321 RQAAITLQRRYRARLNMIKQLKSYAQLKQAAITIQTRYRAKKAMQKQWLYQKQREAIIK 1380 
Db 1414 

Qy 1381 
Db 1474 

Qy 1441 
Db 1534 

Qy 1501 
Db 1594 

Qy 1561 
Db 1654 

Qy 1621 
Db 1714 

Qy 1681 
Db 1774 

Qy 1741 
Db 1834 LLRWCDKDSEIFNTLCTLIWVFAHCPKKRKIIHDYMTNPEAIYMVRETKKLVARKEKMKQ 1893 

Qy 1801 NARKPPPMTSGRYKSQKINFTPCSLPSLEPDFGIIRYSPYTFISSVYAFDTILCKLQIDM 1860 

IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIMIIIIIIIIII 

Db 1894 NARKPPPMTSGRYKSQKINFTPCSLPSLEPDFGIIRYSPYTFISSVYAFDTILCKLQIDM 1953 

Qy 1861 F 1861 

I 

Db 1954 F 1954 

RESULT 2 
Q7QAG9_ANOGA 

ID Q7QAG9_ANOGA PRELIMINARY; PRT; 1399 AA. 

AC Q7QAG9; 
DT 01-MAR-2004 (TrEMBLrel. 26, Created) 
DT 01-MAR-2004 (TrEMBLrel. 26, Last sequence update) 
DT 01-MAR-2004 (TrEMBLrel. 26, Last annotation update) 

DE ENSANGP00000021262 (Fragment). 

GN ORFNames=ENSANGG00000018773; 

OS Anopheles gambiae str. PEST. 

OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; 

OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; Culicidae; 

OC Anophelinae; Anopheles. 

OX NCBI_TaxID=180454; 

RN [1] 

RP NUCLEOTIDE SEQUENCE. 

RC STRAIN=PEST; 

RG The Anopheles gambiae Sequence Committee; 

RT "Anopheles gambiae re-annotation."; 

RL Submitted (APR-2002) to the EMBL/GenBank/DDBJ databases. 

RN [2] 

RP NUCLEOTIDE SEQUENCE. 

RC STRAIN=PEST; 

RG The Anopheles gambiae Sequence Committee; 

RL Submitted (APR-2004) to the EMBL/GenBank/DDBJ databases. 

CC -!- CAUTION: The sequence shown here is derived from an 

CC EMBL/GenBank/DDBJ whole genome shotgun (WGS) entry which is 

CC preliminary data. 

DR EMBL; AAAB01008888; EAA08939.2; -; Genomic_DNA. 

DR InterPro; IPR001715; Calponin-like. 
DR InterPro; IPR000048; IQ_region. 

DR Pfam; PF00307; CH; 1. 

DR Pfam; PF00612; IQ; 25. 

DR SMART; SM00033; CH; 1. 

DR SMART; SM00015; IQ; 19. 

DR PROSITE; PS50021; CH; 1. 

DR PROSITE; PS50096; IQ; 5. 

FT NON_TER 1 1 

SQ SEQUENCE 1399 AA; 166192 MW; 4CB3630FF708E6E7 CRC64; 

Query Match 32.5%; Score 3093; DB 2; Length 1399; 

Best Local Similarity 44.3%; Pred. No. 2.5e-151; 

Matches 640; Conservative 284; Mismatches 421; Indels 100; Gaps 9; 

Qy 469 MKKPATPRKVRDTSIQPSVKLYDSELYMQTCINPDPFAATTTIDPFLASTMYLDEQAVDR 528 

:|: I I : I : I Mil:- :::| I I I I I I I I I I I : I I I llllll:| :: 
Db 1 LKRTAVPCSLPPKSEEKRVFLYDSDRHLKTLINPDPFAATTTCNPFLTVTMYLDERAFEQ 60 

Qy 529 HQADFKKWLNALVSIPADLDADLNNKIDVGKLFNEVRNKELWAPTKEEQSMNYLTKYRL 588 

:: llllllll:|lllll : I : I I I I I I : I I : : I I I :||||| I I I II 
Db 61 YERQMKKWLNALVTIPADLDTEPNKPLDVGKLFDEVKSKELTLAPTKELISSKYY-KTRL 119 
Qy 589 ETLRKAAVELFFSEQMRLPCSKVAVYVNKQALRIRSDRNLHLDWMQRTILELLLCFNPL 648 

II I : I: II:: :| III : II I : I : I I I I I I I : I : I I : I I I I I I I I I I I 

Db 120 NHLRSAGIALYTSEEIAMPLRKVAAQIEKQLLSLRTDRNLHLDLVLQRSILELLLCFNPL 179 

Qy 649 WLRLGLEWFGEKIQMQSNRDIVGLSTFILNRLFRNKCEEQRYSKAVTLTEEVAETIKKH 708 

llllllllllll:|::|||||||||||||::||||:: I I MM l: III ::| 
Db 180 WLRLGLEWFGEQIELQSNRDIVGLSTFIIHRLFRDRYLEARNSKAVNLSRAYAEHMRKF 239 

Qy 709 SLQKILFLLPFLDQAKQKRIVKHNPCLFVKKSPHKETKDILLRFSSELLANIGDITRELR 768 

:|: :llll III I I : : : : : I I I I I I I I : : II I I I I : I I : II : I : I : : Mill: :: 
Db 240 TLRMVLFLLLFLDTAKRRKLIKHNPCLFVRNAPHKETKEILIRFASQLVSGIGDITKHMK 299 

Qy 769 RLGYVLQHRQTFLDEFDYAFNNLAVDLRDGVRLTRWEVILLRDDLTRQLRVPAISRLQR 828 

1:1111 |:|:||||::||| I I I I II II I I I I I I I : I : I I I I I I I : MM Mill: 
Db 300 RVGYVLSHKQSFLDEYNYAFENLAVDLRDGVRLTRVMEIILLRDDLSASLRVPPISRLQK 359 

Qy 829 IFNVKLALGALGEANFQLGGDIAAQDIVDGHREKTLSLLWQLIYKFRSPKFHAAATVLQK 888 

I I: Ml II :|:::: l:: MM II I I I : I : I I M I : : II I I : I I I : I I I III: 

Db 360 IHNINLALVALEQADYKIAGNVTAKDICDGHREQTMSLLWQIVYKFRAPKFNAAAIVLQR 419 

Qy 889 WWRRHWLHWIQRRIRHKELMRRHRAATVIQAVFRGHQMRKYVKLFKTERTQAAIILQKF 948 

III :ll Mill I :|| II III II: :| : : : :: :| : :|:| 
Db 420 WWRMNWLK\n"ISRRIEEKRALRREAAARTIQAAVRGYCVRVWYEAHRRQKLRAIVTIQRF 479 

Qy 949 TRRYLAQKQLYQSYHSIITIQRWWRAQQLGRQHRQRFVELREAAIFLQRIWRRRLFAKKL 1008 

:||IIIM : : :|: IMIII : II MIM l::|| II :|| :|| 

Db 480 SRRYLAQKLAARRFSAIVRIQQWWRTVRQMRQARERFLLCRKSAIVLQTSYRRYALGRKL 539 

Qy 1009 LAAAE TARLQRSQKQQAAASY 1029 

c.n II" I ::| : I :|: 

Db 540 LAAATLIGQIRAEAKHRHLQATIIQRSIKSYVIHRRLHATVNGMVAFIRRKRLQNRSAAK 599 

Qy 1030 IQMQWRTYQLGRIQRHEFLRQRDLIMFVQRRMRSKWSMLEQRKEFQQLKRAAINIQQRWR 1089 

II II II I MM I : :||| I : I I :: :|| :||::| 

Db 600 IQ AYQRMRIVRKEYLRSRSAAICIQRRWRECMEARQLRNRFLLMRASAIRLQQQYR 655 

Qy 1090 AKLSMRKCNADYLALRSSVLKVQAYRKATIQMRIDRNHYYSLRKNVICLQQRLRAIMKMR 1149 

I I : I I : : : : II : I : I I : I : I : I M I : I : I II M 
Db 656 GWRQMRQDRHTYANARNLIVQVQRRWRGTLAMRKERANYRTLRR\n"INVQRRF^^ 715 

Qy 1150 EQRENYLRLRNASILVQKRYRMRQQMIQDRNAYLRTRKCIINVQRRWRATLQMRRERKNY 1209 

_ : I I I M: :|:|:l : M: I I I : IIIMII III I :| 

Db 716 SEVERYRTLCKATVTLQQRFRANKAMMEQRQQYNSLRVATLCVQRRFRAQLSMRAARASY 775 

Qy 1210 LHLQTTTKRIQIKFRAKREMKKQRAEFLQLKKVTLWQKRRRALLQMRKERQEYLHLREV 1269 

:: II ::ll |: I I : I : : I : I I I I I : I I :| I :|: 
Db 776 AKVRCAILTIQSQYRATLAMRHARDRFVTLRRCTITVQARFRAILAGRAAKQRYESIRKA 835 

Qy 1270 TIKLQRRFHAQKSMRFMRAKYRGTQAAVSCLQMHWRNHLLRKRERNSFLQLRQAAITLQR 1329 

I: :ll:: I M :|: II I II II II::: M :| I II III 

Db 836 TLHIQRKWRATLEMRQVRSHYRRQCNAALTLQRSWRGVLLQRKFRHDYLLYRGAATVLQR 895 

Qy 1330 RYRARLNMIKQLKSYAQLKQAAITIQTRYRAKKAMQKQWLYQKQREAIIKVQRRYRGNL 1389 

MM : : : IMIM III I : : : |:::: IMMI I 

Db 896 RYRALVQGRMVRREMQHCRWAAVTIQRRLRATLQMNRDRKAFLQLRQSVLWQRRFRAN- 954 

Qy 1390 EMRKQIEVYQKQRQAVIRLQKWWRSIRDMRLCKAGYRRIRLSSLSIQRKWRATVQARRQR 1449 

I I: : I :: |:::| :l IM MM 

Db 955 RACRVQRVQYAALKRSAITISHRWAATLHMRQQR 988 

Qy 1450 EIFLSTIRKVRLMQAFIRATLLMRQQRREFEMKRRAAWIQRRFRARCAMLKARQDYQLI 1509 

II : I I II : I : : : I 1 I I : : II : : II : I I I I : : 

Db 989 SDFLRLKSATWMQRRYRAQRAKQQAVQQYERMRAAIVLLQRKYRAQRAM 1048 
Qy 1510 QSSVILVQRKFRANRSMKQARQEFVQLRTIAVHLQQKFRGKRL MIEQRNCFQLLRC 1565 

:|: I:|l :l I l::|| : :|::|||| I ::: I |: :| 

Db 1049 KSASIWQEFYRGYRNMRHDRAAFIRLRESVLAIQRRFRGKLLTRQTWDLR— FEQIRR 1106 

Qy 1566 SMPGFQARARGFMARKRFQALMTPEMMDLIRQKRAAKVIQRYWRGYLIRRRQKHQGLLDI 1625 

:: I I II :||: I Ihlil :: l|::|| II :||| |:| : : I 

Db 1107 TVRGLQTYGRGVLARRAFLALLTPEYLERKRQQKAALRIQAWWRGAYHRKRYQTMQMRKI 1166 

Qy 1626 RKRIAQLRQEAKAVNSVRCK— VQEAVRFLRGRFIASDALAVLSQLDRLSRTVPHLLMWC 1683 

: : : | | : : ; | ; : I I I : I I : I : I : : I : I : I : II I I I I I I 
Db 1167 AQQMVASRMAARRDPTIRLSNVSRLCLRFLKTRFSSSEAIGILKRLERMSRLVPHLLMED 1226 

Qy 1684 SEFMSTFCYGIMAQAIRSEVDKQLIERCSRIILNLARYNSTTVNTFQEGGLVTIAQMLLR 1743 

: I:| III :|llllllllll III hllllllll: I III lll|::||||| 
Db 1227 AVFLSVFCYNMMAQAIRSEVDKILIEICARIILNLARFRGTKEQAFQEDGLVTVSQMLLR 1286 

Qy 1744 WCDKDSEIFNTLCTLIWVFAHCPKKRKIIHDYMTNPEAIYMVRETKKLVARKEKMKQNAR 1803 

Mill ll:|llll:|l II II: I II : :|IM:|IIIIII lllll::| : 
Db 1287 WCDKDCGIFSTLCTLLWVLAHDNKKKNAIRRYMISKDAIYMLRETKKLVQRKEKMRKNVQ 1346 

Qy 1804 KP PPMTSGRYKSQKINFTPCSLPSLEPDFGIIRYSPYTFISSVYAFDTILCK 1855 

:| I : ::|lllllll: I II I III: |: :| 

Db 1347 RPVGCLVAPNPQLMR TVPSLEPDFGVNRSKPYVFYSSVFGFERVLQM 1393 

Qy 1856 LQIDM 1860 

I::|: 

Db 1394 LEVDL 1398 